Novel human lysosomal protein and methods of its use

ABSTRACT

The gene associated with and causative of classical late infantile neuronal ceroid lipofuscinosis (LINCL), CLN2, has been identified and characterized. The translation product of this gene is a novel protease, and a deficiency in this activity results in LINCL. Identification of CLN2 will not only aid in the prevention of LINCL through genetic counseling but will also provide strategies and test systems for therapeutic intervention. In addition, further characterization of this previously unknown lysosomal enzyme may provide useful insights into other more common human neurodegenerative disorders. Finally, the utility of a general approach for determining the molecular bases for lysosomal disorders of unknown etiology has been demonstrated.

[0001] The research leading to the present invention was supported, in part, by National Institutes of Health Grants DK45992 and NS30147. Accordingly, the Government may have certain rights in the invention.

FIELD OF THE INVENTION

[0002] The present invention relates to the identification of a gene (CLN2) which, when mutated, results in the neurodegenerative disease classical late infantile neuronal ceroid lipofuscinosis (LINCL). CLN2 encodes a pepstatin-insensitive carboxyl protease which is a 46 kDa lysosomal protein that is absent or mutated in LINCL. Thus, the invention provides the protease (CLN2), nucleic acids encoding CLN2, oligonucleotides specific for such nucleic acids, antibodies to CLN2, and methods for restoring the activity of CLN2 to ameliorate the symptoms of LINCL. Various diagnostic and therapeutic aspects of the invention particularly relate to detection and treatment of LINCL.

BACKGROUND OF THE INVENTION

[0003] The neuronal ceroid lipofuscinoses (NCLs) are a group of closely related hereditary neurodegenerative disorders which affect infants, children and adults, and which occur at a frequency of between 2 and 4 in 100,000 live births (1, 2). Most forms of NCL afflict children and their early symptoms and disease progression tend to be similar. Initial diagnosis is frequently based upon visual problems, behavioral changes and seizures. Progression is reflected by a decline in mental abilities, increasingly severe and untreatable seizures, blindness and loss of motor skills, while further progression can result in dementia or a vegetative state. There is no effective treatment for NCL and all childhood forms are eventually fatal. Several forms of NCL are differentiated according to age of onset, clinical pathology and genetic linkage. These include infantile NCL (INCL, CLN1), classical late infantile NCL (LINCL, CLN2), juvenile NCL (JNCL, CLN3), adult NCL (CLN4), two variant forms of LINCL (CLN5 and CLN6) and possibly other atypical forms (1,3). The molecular bases for two of these forms of NCL have recently been identified by positional cloning. Mutations in palmitoyl protein thioesterase (PPT), which removes the lipid moiety from acylated proteins, result in INCL (4). JNCL results from mutations in the CLN3 gene product, a 48 kDa protein of currently unknown function (5). The identity of the molecular lesion in LINCL has remained elusive although the disease gene has recently been mapped to chromosome 11p15 by genetic linkage analysis (3). There are reasons, however, to suspect that the CLN2 gene product could have a lysosomal function. First, LINCL, like other forms of NCL, is characterized by an accumulation of autofluorescent lysosome-like storage bodies in the neurons and other cells of patients. Second, a number of other related neurological disorders are caused by lysosomal deficiencies, e.g., PPT in INCL, neuraminidase in sialidosis and β-hexosaminidase A in Tay-Sachs disease. Third, continuous infusion of leupeptin and other lysosomal protease inhibitors into the brains of young rats induces a massive accumulation of ceroid-lipofuscin in neurons that resembles NCL (6,7).

[0004] Thus, there is a need in the art to identify and characterize the CLN2 gene and its gene product (CLN2).

[0005] There is a further need to develop diagnostic and therapeutic applications, based on CLN2, for prenatal testing and treatment of LINCL.

[0006] The present invention addresses these and similar needs in the art.

[0007] The citation of any reference herein should not be construed as an admission that such reference is available as prior art to the invention.

SUMMARY OF THE INVENTION

[0008] Classical late infantile neuronal ceroid lipofuscinosis (LINCL) is a fatal neurodegenerative disease whose defective gene (CLN2) has remained elusive. The molecular basis for LINCL has been determined here using an approach that should be applicable to other lysosomal storage diseases. Using the mannose 6-phosphate carbohydrate modification of newly synthesized lysosomal enzymes as an affinity marker, a single lysosomal enzyme was identified which is absent in LINCL. This protein was purified, cloned and sequenced. Sequence comparisons and activity measurements suggest that the CLN2 protein is a novel pepstatin-insensitive lysosomal peptidase. In patients, a number of mutations in the gene encoding this protein were found, confirming it as CLN2.

[0009] A biochemical approach, which relies upon the fact that newly synthesized soluble lysosomal enzymes contain a modified carbohydrate, mannose 6-phosphate (Man 6-P), was used to identify a protein that is deficient in LINCL. Man 6-P functions as a targeting signal in vivo as it is recognized by Man 6-P receptors (MPRs) which direct the intracellular vesicular targeting of newly synthesized lysosomal enzymes from the Golgi to a prelysosomal compartment (8). Purified cation-independent MPR can be used as an affinity reagent for the detection of immobilized Man 6-P glycoproteins in a Western blot-style assay or can be coupled as an affinity chromatography reagent for the purification of Man 6-P glycoproteins (9,10,11). Thus, a preferred embodiment of the invention includes purification of lysosomal proteins by affinity chromatography using immobilized MPR, followed by peptide sequence analysis, and then use of this sequence information to design nucleic acid probes that can be used for isolation, identification, and characterization of lysosomal protein genes.

[0010] CLN2 has been identified and the translation product of this gene is a novel protease, which when absent or defective results in LINCL. Identification of CLN2 will not only aid in the prevention of LINCL through genetic counseling but will also provide strategies and test systems for therapeutic intervention. In addition, further characterization of this previously unknown lysosomal enzyme may provide useful insights into other more common human neurodegenerative disorders. Furthermore, the utility of a general approach for determining the molecular bases for lysosomal disorders of unknown etiology has been demonstrated (22).

[0011] The present invention is broadly directed to an isolated and characterized LINCL-associated gene (CLN2) and gene product (CLN2). CLN2 is a pepstatin-insensitive carboxyl protease. In a specific embodiment, CLN2 has an amino acid sequence as depicted in FIG. 3 (SEQ ID NO:3). In another specific embodiment, CLN2 has a nucleotide sequence as depicted in FIG. 3 (SEQ ID NO:1).

[0012] CLN2 is expressed in healthy individuals. However, LINCL patients either have no CLN2 or express a defective (mutant) CLN2. Thus, the present invention advantageously provides materials capable of ameliorating LINCL by delivering wild-type CLN2 to LINCL patients either through gene therapy or through administration of a pharmaceutical preparation of CLN2 or a CLN2 analog.

[0013] The present invention further relates to a chimeric protein comprising the protein or fragment thereof. In specific embodiments, infra, such a chimeric protein consists of maltose binding protein or poly-histidine with CLN2. However, the invention specifically contemplates chimeric proteins comprising a targeting moiety, preferably an intracellular targeting moiety, with CLN2.

[0014] Naturally, in addition to the isolated protein and fragments thereof, the invention provides a purified nucleic acid encoding a CLN2 protease, or a fragment thereof having at least 15 nucleotides. In a specific embodiment, the nucleic acid encodes CLN2 having an amino acid sequence as depicted in FIG. 3 (SEQ ID NO:3). In a more specific embodiment, the nucleic acid has a nucleotide sequence as depicted in FIG. 3 (SEQ ID NO:1). The invention further provides 5′ and 3′ non-coding sequences, as depicted in FIG. 3 and SEQ ID NO:1. The invention still further provides an alternatively spliced product (still coding for the same full-length CLN2 protease), as depicted in FIG. 3 and SEQ ID NO:2.

[0015] In a specific embodiment, the purified nucleic acid is DNA. The DNA may be provided in a recombinant DNA vector. Preferably, the DNA vector is an expression vector, wherein the DNA encoding the CLN2 is operatively associated with an expression control sequence, whereby transformation of a host cell with the expression vector provides for expression of CLN2, or a fragment thereof as set forth above. Thus, the invention further provides a transformed host cell comprising the DNA vector. In a specific embodiment, the host cell is a bacterial cell. In another specific embodiment, the host cell is a mammalian cell.

[0016] The invention further provides a recombinant virus comprising the DNA expression vector. The recombinant virus may be selected from the group consisting of a retrovirus, herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, and adeno-associated virus (AAV).

[0017] Corollary to the recombinant DNA expression vectors, the invention provides a method for producing a CLN2 comprising expressing the expression vector in a recombinant host cell of the invention under conditions that provide for expression of the CLN2. The methods of expression of the invention may be practiced, for example, in a bacterium, or in a mammalian cell.

[0018] The nucleic acids of the invention also provide a method for increasing the level of expression of a CLN2. Accordingly, an expression vector may be introduced into a host in vivo under conditions that provide for expression of the CLN2. In one embodiment, the expression vector is a viral expression vector. In another embodiment, the expression vector is a naked DNA expression vector.

[0019] The invention further provides a method for treating LINCL by increasing the level of CLN2 in patients with LINCL. In one embodiment, the level of CLN2 is increased by administration of CLN2. In another embodiment, the level of CLN2 is increased by administration of a recombinant expression vector to the cells demonstrating uncontrolled proliferation, which expression vector provides for expression of the CLN2 in vivo. In one embodiment, the expression vector is a viral expression vector; alternatively, the expression vector is a naked DNA expression vector.

[0020] The present invention provides a protease assay (specific for CLN2 protease) to determine LINCL prognosis and the efficacy of any therapeutic treatment of the disease.

[0021] In addition to therapeutic aspects, the present invention provides oligonucleotides and antibodies for detection of CLN2, and diagnosis of conditions associated with decreased levels of wild-type CLN2 expression.

[0022] Thus, in one aspect, the invention provides an oligonucleotide of greater than 20 nucleotides which hybridizes under stringent conditions to the nucleic acid encoding CLN2. Preferably, the oligonucleotide hybridizes under conditions wherein the T_(m) is greater than 60° C. More preferably, the oligonucleotide hybridizes at a T_(m) of greater than 65° C. In another embodiment, the oligonucleotide hybridizes at 40% formamide, with 5× or 6×SSC. In a specific embodiment, exemplified infra, the oligonucleotide is an antisense oligonucleotide that hybridizes to CLN2 mRNA.

[0023] In another aspect, the invention provides an antibody specific for CLN2. The antibody may be polyclonal or monoclonal. In a specific embodiment, exemplified infra, the antibody is a rabbit polyclonal antibody generated against a CLN2 fusion protein. In a specific embodiment, the antibody is labeled, e.g., with a label selected from the group consisting of a radioisotope, an enzyme, a chelating agent, a fluorophore, a chemiluminescent molecule, and a particle.

[0024] The oligonucleotides and antibodies of the invention can be used to detect the presence or level of CLN2, or nucleic acids encoding it, in a biological sample. In one embodiment, the invention provides a method for detecting CLN2 in a biological sample comprising contacting a biological sample with an antibody specific for CLN2 under conditions that allow for antibody binding to antigen; and detecting formation of reaction complexes comprising the antibody and CLN2 in the sample. The detection of formation of reaction complexes indicates the presence of CLN2 in the sample. The level of CLN2 can be quantitated by evaluating the amount of reaction complexes formed, wherein the amount of reaction complexes corresponds to the level of CLN2 in the biological sample. Alternatively, a method for detecting CLN2 mRNA in a biological sample comprises contacting a biological sample with an oligonucleotide of the invention under conditions that allow for hybridization with mRNA; and detecting hybridization of the oligonucleotide to mRNA in the sample. The detection of hybridization indicates the presence of CLN2 mRNA in the sample. The level of expression of CLN2 mRNA can be determined by evaluating the quantity of oligonucleotide hybridized, wherein the quantity of oligonucleotide hybridized corresponds to the level of CLN2 in the biological sample.

[0025] Thus, a primary object of the invention is to provide a novel lysosomal protein that is a pepstatin-insensitive carboxyl protease (CLN2), the mutation or absence of which is causative of LINCL.

[0026] Another object of the invention is to provide a nucleic acid, preferably a DNA molecule, coding for such a protein.

[0027] Still another object of the invention is to ameliorate LINCL by administering CLN2-gene therapy or CLN2 protease, and variants thereof, in a pharmaceutical composition.

[0028] These and other objects of the present invention will be better understood by reference to the following Drawings and the Detailed Description of the Invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0029] FIG. 1. A protein deficient in LINCL. Detergent solubilized extracts of gray matter (50 μg protein) from normal (top) or LINCL (bottom) brain autopsy specimens were fractionated by isoelectric focusing and SDS-PAGE, transferred to nitrocellulose, and Man 6-P glycoproteins detected using ¹²⁵I-labeled MPR. The Man 6-P glycoprotein that is absent in LINCL extracts is indicated by an arrow.

[0030] FIG. 2. CLN2 expression in different human tissues. A Northern blot of polyA+ human RNA (CLONTECH, Palo Alto, Calif.) containing 2 μg polyadenylated RNA was probed with the ³²P-labeled insert of EST37588. Hybridization with two transcripts of approximate size 2.7 and 3.7 kb is evident in all tissues. S. muscle; skeletal muscle.

[0031] FIG. 3. Nucleotide sequence of the human CLN2 mRNA and conceptual amino acid sequence. The nucleotide sequence shown is a composite derived from the complete sequences of 68 ESTs which together cover nucleotides 21-3487, a human genomic clone encompassing the entire gene except the first 236 nucleotides and two independent PCR products from a human cortex cDNA library which encoded the most 5′ 146 nucleotides including the probable initiation codon. An unfilled arrow indicates the predicted signal cleavage site and a filled arrow indicates the known N-terminus of the mature/heavy chain. Potential N-linked glycosylation sites are indicated by heavy underlining and the boxed region indicates the N-terminal amino acid sequence obtained from the purified protein. * indicates amino acids which are mutated in LINCL patients. Dashed underlining indicates a likely polyA addition consensus sequence for the longer transcript and the position of the polyA tail of the shorter transcript is also indicated. Note: there appears to be a polymorphism in the 3′ UTS (S at 2824); of 20 EST sequences examined, 13 were G at this position and 7 were C.

[0032] FIG. 4. Sequence similarities to CLN2. Aligned sequences of the human CLN2 protein, Pseudomonas sp. 101 pepstatin-insensitive carboxyl proteinase (PsCP), and Xanthomonas sp. T-22 pepstatin-insensitive carboxyl proteinase (XaCP). Shading indicates regions of amino acid conservation: heavy shading indicates identical amino acids and light shading indicates similar amino acids. Predicted and known peptide cleavage sites are indicated by unfilled and filled arrows, respectively. XaCP has a 192 amino acid C-terminal extension (ellipsis) that is proteolytically removed.

[0033] FIG. 5. Enzymatic activity of CLN2. Pepstatin sensitive and insensitive protease activities in extracts of normal and LINCL brain samples. Samples were homogenized in 50 volumes (w/v) of 0.15 M NaCl, 0.1% Triton X-100 and centrifuged at 14,000×g for 25 min. Pepstatin insensitive activity in the supernatant was measured using 1% bovine hemoglobin as a substrate in 25 mM formate buffer containing 2 μM pepstatin, 0.1 mM E-64, 0.15 M NaCl and 0.1% Triton X-100 pH 3.5. The TCA soluble degradation products were quantitated with fluorescamine (S. De Bernardo, et al., Archives of Biochemistry and Biophysics 163, 390-399 (1974)) in borate buffer pH 8.6. Cathepsin D activity was measured under identical conditions but omitting pepstatin.

DETAILED DESCRIPTION OF THE INVENTION

[0034] The invention provides a novel pepstatin-insensitive carboxyl protease, termed herein CLN2, including biologically active fragments thereof.

[0035] For purposes of the present description, the term “isolated” means at the least removed from a natural cellular location. Preferably, CLN2 is purified, so that it comprises at least 50%, preferably at least 75%, and more preferably at least 90% of protein (in the case of a nucleic acid, of nucleic acids) in a sample.

[0036] A composition comprising “A” (where “A” is a single protein, DNA molecule, vector, recombinant host cell, etc.) is substantially free of “B” (where “B” comprises one or more contaminating proteins, DNA molecules, vectors, etc.) when at least about 75% by weight of the proteins, DNA, vectors (depending on the category of species to which A and B belong) in the composition is “A”. Preferably, “A” comprises at least about 90% by weight of the A+B species in the composition, most preferably at least about 99% by weight. It is also preferred that a composition, which is substantially free of contamination, contain only a single molecular weight species having the activity or characteristic of the species of interest.
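The weight-fraction thresholds in the preceding paragraph can be illustrated with a minimal sketch. The function names and tier labels below are hypothetical, and the sketch applies 75%, 90% and 99% as exact cut-offs, whereas the text qualifies each threshold with "about".

```python
def weight_fraction(a_mass, b_masses):
    """Fraction by weight of species A among the A+B species.

    a_mass: mass of the species of interest (any consistent unit).
    b_masses: iterable of masses of contaminating species B.
    """
    total = a_mass + sum(b_masses)
    if total <= 0:
        raise ValueError("total mass must be positive")
    return a_mass / total


def purity_tier(fraction):
    """Map a weight fraction onto the tiers named in the text.

    Assumption: thresholds are treated as exact values; the tier
    labels are illustrative and not terms defined by the document.
    """
    if fraction >= 0.99:
        return "most preferred"
    if fraction >= 0.90:
        return "preferred"
    if fraction >= 0.75:
        return "substantially free"
    return "not substantially free"
```

For example, 9.0 mg of A accompanied by two contaminants of 0.5 mg each gives a weight fraction of 0.9, which falls in the "preferred" tier of this sketch.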

[0037] In a specific embodiment, the term “about” means within about 20%, preferably within about 10%, and more preferably within about 5%, of the value modified.

[0038] The term “CLN2” (note absence of italics) is interchangeable with “CLN2 protein”, “CLN2 protease”, and “CLN2 pepstatin-insensitive carboxyl protease”. CLN2 has the amino acid sequence depicted in FIG. 3 and in SEQ ID NO:3.

[0039] The term “CLN2” (note presence of italics) is used in reference to the gene and the mRNA encoding the CLN2 protease. CLN2 has the nucleotide sequence depicted in FIG. 3 and in SEQ ID NO:1. Additionally, an alternatively spliced form of the mRNA is depicted in FIG. 3 and in SEQ ID NO:2.

[0040] The term “LINCL” is an acronym for classical late infantile neuronal ceroid lipofuscinosis.

[0041] In addition to the CLN2 protein and polypeptide fragments, the invention contemplates chimeric proteins with CLN2 or a fragment thereof. A CLN2 fusion protein comprises at least a functionally active portion of a non-CLN2 protein (termed herein the “fusion partner”) joined via a peptide bond to at least a functionally active portion of a CLN2 polypeptide. The non-CLN2 sequences can be amino- or carboxyl-terminal to the CLN2 sequences. In specific embodiments, infra, CLN2 and the catalytic domain polypeptide fragment of CLN2 are expressed as fusion proteins, in which the fusion partner is maltose binding protein or poly-histidine. However, the present invention contemplates fusion to any protein (or polypeptide), including marker proteins such as lacZ, signal peptides for extracellular or periplasmic expression, and different nuclear localization peptides, to mention but a few possibilities. The invention further contemplates joining CLN2, or a polypeptide fragment domain thereof, with a different protein to create a hybrid fusion protein having different target specificity, particularly targeting for intracellular translocation, catalytic activity, or other combinations of properties from the CLN2 or fragment of the invention with the fusion partner. A recombinant DNA molecule encoding such a fusion protein comprises a sequence encoding at least a functionally active portion of a non-CLN2 protein joined in-frame to the CLN2 coding sequence, and preferably encodes a cleavage site for a specific protease, e.g., thrombin or Factor Xa, preferably at the CLN2-non-CLN2 juncture. In a specific embodiment, the fusion protein is expressed in Escherichia coli.

Genes Encoding CLN2 Protease

[0042] The present invention contemplates isolation of a gene encoding a CLN2 protein of the invention, including a full length, or naturally occurring form of CLN2, and any antigenic fragments thereof from any animal, particularly mammalian or avian, and more particularly human, source. As used herein, the term “gene” refers to an assembly of nucleotides that encode a polypeptide, and includes cDNA and genomic DNA nucleic acids.

[0043] Thus, in accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (herein “Sambrook et al., 1989”); DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed. 1984); Nucleic Acid Hybridization [B. D. Hames & S. J. Higgins eds. (1985)]; Transcription And Translation [B. D. Hames & S. J. Higgins, eds. (1984)]; Animal Cell Culture [R. I. Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal, A Practical Guide To Molecular Cloning (1984); F. M. Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).

[0044] Therefore, if appearing herein, the following terms shall have the definitions set out below.

[0045] A “vector” is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment. A “replicon” is any genetic element (e.g., plasmid, chromosome, virus) that functions as an autonomous unit of DNA replication, i.e., capable of replication under its own control.

[0046] A cell has been “transfected” by exogenous or heterologous DNA when such DNA has been introduced inside the cell. A cell has been “transformed” by exogenous or heterologous DNA when the transfected DNA expresses mRNA, which preferably is translated into a protein. Usually, expression of such a protein effects a phenotypic or functional change in the cell. However, the protein may be expressed without significantly affecting the cell, e.g., in the instance of fermentation of transformed cells for production of a recombinant polypeptide. Preferably, the transforming DNA should be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.

[0047] “Heterologous” DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell. Preferably, the heterologous DNA includes a gene foreign to the cell.

[0048] A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA, and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.

[0049] A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a T_(m) of 55° C., can be used (e.g., 5×SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5×SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher T_(m), e.g., 40% formamide, with 5× or 6×SSC. High stringency hybridization conditions correspond to the highest T_(m), e.g., 50% formamide, 5× or 6×SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of T_(m) for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher T_(m)) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating T_(m) have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). Preferably a minimum length for a hybridizable nucleic acid is at least about 10 nucleotides; more preferably at least about 15 nucleotides; most preferably the length is at least about 20 nucleotides.
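As an illustration of the kind of T_(m) equation referred to above for hybrids longer than 100 nucleotides, the following sketch uses a widely cited empirical approximation for DNA:DNA hybrids. It is offered under stated assumptions only: the function name, parameter names and coefficients are not taken from this document and may differ in detail from the equations in the cited reference.

```python
import math


def estimate_tm_celsius(na_molar, gc_percent, formamide_percent, length_nt):
    """Estimate the melting temperature (deg C) of a long DNA:DNA hybrid.

    Assumed empirical approximation (not stated in the document):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C)
             - 0.63*(%formamide) - 600/length
    na_molar: monovalent cation concentration in mol/L.
    gc_percent: G+C content of the hybrid, 0-100.
    formamide_percent: formamide concentration (v/v), 0-100.
    length_nt: hybrid length in nucleotides (intended for > 100 nt).
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)
```

Under this approximation, adding formamide lowers the estimated T_(m) of a given hybrid, so at a fixed hybridization temperature a higher formamide percentage leaves less margin for mismatched hybrids to remain annealed.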

[0050] In a specific embodiment, the term “standard hybridization conditions” refers to a T_(m) of 55° C. and utilizes conditions as set forth above. In a preferred embodiment, the T_(m) is 60° C.; in a more preferred embodiment, the T_(m) is 65° C.

[0051] As used herein, the term “oligonucleotide” refers to a nucleic acid, generally of at least 18 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule encoding CLN2. Oligonucleotides can be labeled, e.g., with ³²P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated (see the discussion, supra, with respect to labeling polypeptides). In one embodiment, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid encoding CLN2. In another embodiment, oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of CLN2, or to detect the presence of nucleic acids encoding CLN2. In a further embodiment, an oligonucleotide of the invention can form a triple helix with a CLN2 DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.

[0052] “Homologous recombination” refers to the insertion of a foreign DNA sequence of a vector in a chromosome. Preferably, the vector targets a specific chromosomal site for homologous recombination. For specific homologous recombination, the vector will contain sufficiently long regions of homology to sequences of the chromosome to allow complementary binding and incorporation of the vector into the chromosome. Longer regions of homology, and greater degrees of sequence similarity, may increase the efficiency of homologous recombination.

[0053] A DNA “coding sequence” is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.

[0054] Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences.

[0055] A “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3′ direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.

[0056] A coding sequence is “under the control of”, “operably associated with”, or “operatively associated with” transcriptional and translational (i.e. expression) control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced and translated into the protein encoded by the coding sequence.

[0057] A “signal sequence” is included at the beginning of the coding sequence of a protein to be expressed on the surface of a cell. This sequence encodes a signal peptide, N-terminal to the mature polypeptide, that directs the host cell to translocate the polypeptide. The term “translocation signal sequence” is used herein to refer to this sort of signal sequence. Translocation signal sequences can be found associated with a variety of proteins native to eukaryotes and prokaryotes, and are often functional in both types of organisms.

[0058] As used herein, the term “sequence homology” in all its grammatical forms refers to the relationship between proteins that possess a “common evolutionary origin,” including proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous proteins from different species (e.g., myosin light chain, etc.) (Reeck et al., 1987, Cell 50:667).

[0059] Accordingly, the term “sequence similarity” in all its grammatical forms refers to the degree of identity or correspondence between nucleic acid or amino acid sequences of proteins that do not share a common evolutionary origin (see Reeck et al., supra). However, in common usage and in the instant application, the term “homologous,” when modified with an adverb such as “highly,” may refer to sequence similarity and not a common evolutionary origin.

[0060] In a specific embodiment, two DNA sequences are “substantially homologous” or “substantially similar” when at least about 50% (preferably at least about 75%, and most preferably at least about 90 or 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid Hybridization, supra.
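
By way of illustration only, the percentage thresholds recited above may be sketched as follows. The sketch assumes that the two DNA sequences have already been aligned to the same defined length (the alignment itself is not shown), and the function names `percent_identity` and `homology_tier` are hypothetical labels introduced here, not part of any sequence analysis package.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match.

    Assumption: both sequences cover the same defined length.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the tiers recited in the text (about 50%, 75%, 90%)."""
    if identity >= 90.0:
        return "most preferred"
    if identity >= 75.0:
        return "preferred"
    if identity >= 50.0:
        return "substantially homologous"
    return "below threshold"
```

For example, two aligned ten-nucleotide sequences that differ at two positions would score 80% and fall in the “preferred” tier under these assumptions.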

[0061] Similarly, in a particular embodiment, two amino acid sequences are “substantially homologous” or “substantially similar” when greater than 30% of the amino acids are identical, or greater than about 60% are similar (functionally identical). Preferably, the similar or homologous sequences are identified by alignment using, for example, the GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wis.) pileup program.

[0062] The term “corresponding to” is used herein to refer to similar or homologous sequences, whether the exact position is identical or different from the molecule to which the similarity or homology is measured. Thus, the term “corresponding to” refers to the sequence similarity, and not the numbering of the amino acid residues or nucleotide bases.

[0063] A gene encoding CLN2, whether genomic DNA or cDNA, can be isolated from any source, particularly from a human cDNA or genomic library. Methods for obtaining the CLN2 gene are well known in the art, as described above (see, e.g., Sambrook et al., 1989, supra).

[0064] Accordingly, any animal cell potentially can serve as the nucleic acid source for the molecular cloning of a CLN2 gene. The DNA may be obtained by standard procedures known in the art from cloned DNA (e.g., a DNA “library”), and preferably is obtained from a cDNA library prepared from tissues with high level expression of the protein, by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA, or fragments thereof, purified from the desired cell (See, for example, Sambrook et al., 1989, supra; Glover, D. M. (ed.), 1985, DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K. Vol. I, II). Clones derived from genomic DNA may contain regulatory and intron DNA regions in addition to coding regions; clones derived from cDNA will not contain intron sequences. Whatever the source, the gene should be molecularly cloned into a suitable vector for propagation of the gene.

[0065] In the molecular cloning of the gene from genomic DNA, DNA fragments are generated, some of which will encode the desired gene. The DNA may be cleaved at specific sites using various restriction enzymes. Alternatively, one may use DNAse in the presence of manganese to fragment the DNA, or the DNA can be physically sheared, as for example, by sonication. The linear DNA fragments can then be separated according to size by standard techniques, including but not limited to, agarose and polyacrylamide gel electrophoresis and column chromatography.
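
The cleavage and size-separation steps described above may be pictured with the following minimal sketch. It models only the top strand of a linear DNA molecule and ignores the overhangs left on the complementary strand. The function names `digest` and `sort_by_size` and the example sequence in the closing note are hypothetical and introduced solely for illustration. The only enzyme-specific fact relied upon is the well-known EcoRI site GAATTC, which is cut after the first G.

```python
from typing import List


def digest(dna: str, site: str, cut_offset: int) -> List[str]:
    """Cut a linear top strand at every occurrence of a recognition site.

    cut_offset is the number of bases into the site at which the strand
    is cut (1 for EcoRI, which cuts G^AATTC).
    """
    dna = dna.upper()
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])  # fragment ending at this cut
        start = cut
        pos = dna.find(site, pos + 1)     # look for the next site
    fragments.append(dna[start:])          # remainder after the last cut
    return fragments


def sort_by_size(fragments: List[str]) -> List[str]:
    """Order fragments from longest to shortest, as a stand-in for gel separation."""
    return sorted(fragments, key=len, reverse=True)
```

Calling `digest("AAGAATTCGGGGGAATTCTT", "GAATTC", 1)` on this made-up sequence cuts at both sites, and `sort_by_size` then orders the resulting fragments by length in the way a gel would resolve them.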

[0066] Once the DNA fragments are generated, identification of the specific DNA fragment containing the desired CLN2 gene may be accomplished in a number of ways. For example, if an amount of a portion of a CLN2 gene or its specific RNA, or a fragment thereof, is available and can be purified and labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad. Sci. U.S.A. 72:3961). For example, a set of oligonucleotides corresponding to the cDNA for the CLN2 protein can be prepared and used as probes for DNA encoding CLN2, as was done in a specific example, infra, or as primers for cDNA or mRNA (e.g., in combination with a poly-T primer for RT-PCR). Preferably, a fragment is selected that is highly unique to CLN2 of the invention. Those DNA fragments with substantial sequence similarity to the probe will hybridize. As noted above, the greater the degree of sequence similarity, the more stringent hybridization conditions can be used. In a specific embodiment, low stringency hybridization conditions (50° C., 50% formamide, 5× SSC, 5× Denhardt's solution) can be used to identify a homologous CLN2 gene, preferably a human CLN2 gene, using a murine CLN2 cDNA probe.

[0067] Further selection can be carried out on the basis of the properties of the gene, e.g., if the gene encodes a protein product having the isoelectric, electrophoretic, amino acid composition, uniquely characteristic set of structural domains, or partial amino acid sequence of CLN2 protein as disclosed herein. Thus, the presence of the gene may be detected by assays based on the physical, chemical, or immunological properties of its expressed product. For example, the rabbit polyclonal antibody to murine or human CLN2, described in detail infra, may be used to confirm expression of CLN2. In another aspect, a protein that has an apparent molecular weight of ~46 kDa, and which is biochemically determined to have a pepstatin-insensitive carboxyl protease activity, is a good candidate for CLN2.

[0068] A preferred embodiment of the invention comprises a novel method for identifying genes which encode lysosomal proteins. This method relies on the observation that all lysosomal enzymes are glycosylated with mannose 6-phosphate (Man 6-P). Therefore, these proteins can be readily purified using an affinity chromatography matrix comprised of the mannose 6-phosphate receptor (MPR) (which also has functionality, in the form of enzyme- or radio-labeled conjugates, for visualization in blotting applications) immobilized on a solid support. Proteins purified on this affinity matrix can be sequenced and thus yield the critical information for designing nucleic acid probes for use in isolation and identification of the gene.

[0069] The present invention also relates to cloning vectors containing genes encoding CLN2, active fragments thereof, analogs, and derivatives of CLN2 of the invention, that have the same or homologous functional activity as CLN2, and homologs thereof from other species. The production and use of derivatives and analogs related to CLN2 are within the scope of the present invention. For example, a fragment corresponding to the catalytic domain exhibits enzymatic activity. In a specific embodiment, the derivative or analog is functionally active, i.e., capable of exhibiting one or more functional activities associated with a full-length, wild-type CLN2 of the invention.

[0070] CLN2 derivatives can be made by altering encoding nucleic acid sequences by substitutions, additions or deletions that provide for functionally equivalent molecules. Preferably, derivatives are made that have enhanced or increased functional activity relative to native CLN2.

[0071] Due to the degeneracy of nucleotide coding sequences, other DNA sequences which encode substantially the same amino acid sequence as a CLN2 gene may be used in the practice of the present invention. These include but are not limited to allelic genes, homologous genes from other species, and nucleotide sequences comprising all or portions of CLN2 genes which are altered by the substitution of different codons that encode the same amino acid residue within the sequence, thus producing a silent change. Likewise, the CLN2 derivatives of the invention include, but are not limited to, those containing, as a primary amino acid sequence, all or part of the amino acid sequence of a CLN2 protein including altered sequences in which functionally equivalent amino acid residues are substituted for residues within the sequence resulting in a conservative amino acid substitution. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of a similar polarity, which acts as a functional equivalent, resulting in a silent alteration. Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs. For example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. The positively charged (basic) amino acids include arginine, lysine and histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Such alterations will not be expected to affect apparent molecular weight as determined by polyacrylamide gel electrophoresis, or isoelectric point.
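
The two kinds of alteration described above, silent codon changes and conservative amino acid substitutions, may be illustrated by the following sketch. It assumes the standard genetic code and uses the five residue classes exactly as listed in the preceding paragraph. Because those classes overlap (tyrosine, for example, is listed as both aromatic and polar neutral), a substitution is treated here as conservative when the two residues share at least one class. All names (`CODON_TABLE`, `is_silent_change`, `RESIDUE_CLASSES`, `shared_classes`, `is_conservative_substitution`) are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Residue classes as listed in the text (one-letter codes).
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "aromatic": set("FWY"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def is_silent_change(codon_a: str, codon_b: str) -> bool:
    """True when two different DNA codons encode the same amino acid."""
    a, b = codon_a.upper(), codon_b.upper()
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]


def shared_classes(res_a: str, res_b: str) -> set:
    """Names of the classes that contain both residues."""
    return {
        name
        for name, members in RESIDUE_CLASSES.items()
        if res_a in members and res_b in members
    }


def is_conservative_substitution(res_a: str, res_b: str) -> bool:
    """True when two different residues belong to at least one common class."""
    return res_a != res_b and bool(shared_classes(res_a, res_b))
```

Under these assumptions, replacing GAA with GAG is a silent change (both encode glutamic acid), whereas replacing lysine with arginine is a conservative substitution within the basic class.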

[0072] Particularly preferred substitutions are:

[0073] Lys for Arg and vice versa such that a positive charge may be maintained;

[0074] Glu for Asp and vice versa such that a negative charge may be maintained;

[0075] Ser for Thr such that a free -OH can be maintained; and

[0076] Gln for Asn such that a free NH₂ can be maintained.

[0077] Substitutions of Glu for Asp and vice versa, or “switching” acidic amino acid residues with other residues, while retaining the total number of acidic residues in the acidic domain, are expected to retain the functional activity of that domain.

[0078] Amino acid substitutions may also be introduced to substitute an amino acid with a particularly preferable property. For example, a Cys may be introduced as a potential site for disulfide bridges with another Cys. A His may be introduced as a particularly “catalytic” site (i.e., His can act as an acid or base and is the most common amino acid in biochemical catalysis). Pro may be introduced because of its particularly planar structure, which induces β-turns in the protein's structure.

[0079] The genes encoding CLN2 derivatives and analogs of the invention can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level. For example, the cloned CLN2 gene sequence can be modified by any of numerous strategies known in the art (Sambrook et al., 1989, supra). The sequence can be cleaved at appropriate sites with restriction endonuclease(s), followed by further enzymatic modification if desired, isolated, and ligated in vitro. In the production of the gene encoding a derivative or analog of CLN2, care should be taken to ensure that the modified gene remains within the same translational reading frame as the CLN2 gene, uninterrupted by translational stop signals, in the gene region where the desired activity is encoded.
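
The reading-frame precaution noted above lends itself to a simple check, sketched below. The sketch assumes that the modified coding region is supplied as a DNA string beginning at the first base of its first codon, and it treats a stop codon in the final position as acceptable. The name `frame_is_open` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_is_open(coding_seq: str) -> bool:
    """Check that a coding region stays in frame and has no internal stop codon.

    Assumption: coding_seq starts at the first base of the first codon.
    A terminal stop codon is allowed; any earlier stop codon is not.
    """
    seq = coding_seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # an insertion or deletion has shifted the frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])
```

A construct whose length is no longer a multiple of three, or which acquires a stop codon upstream of the region encoding the desired activity, would fail this check and call for redesign of the junction.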

[0080] Additionally, the CLN2-encoding nucleic acid sequence can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification. Preferably, such mutations enhance the functional activity of the mutated CLN2 gene product. Any technique for mutagenesis known in the art can be used, including but not limited to, in vitro site-directed mutagenesis (Hutchinson, C., et al., 1978, J. Biol. Chem. 253:6551; Zoller and Smith, 1984, DNA 3:479-488; Oliphant et al., 1986, Gene 44:177; Hutchinson et al., 1986, Proc. Natl. Acad. Sci. U.S.A. 83:710), use of TAB linkers (Pharmacia), etc. PCR techniques are preferred for site directed mutagenesis (see Higuchi, 1989, “Using PCR to Engineer DNA”, in PCR Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press, Chapter 6, pp. 61-70).

[0081] The identified and isolated gene can then be inserted into an appropriate cloning vector. A large number of vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Examples of vectors include, but are not limited to, E. coli, bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pMal-c, pFLAG, etc. The insertion into a cloning vector can, for example, be accomplished by ligating the DNA fragment into a cloning vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. Alternatively, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences. Recombinant molecules can be introduced into host cells via transformation, transfection, infection, electroporation, etc., so that many copies of the gene sequence are generated. Preferably, the cloned gene is contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired. For example, a shuttle vector, which is a vector that can replicate in more than one type of organism, can be prepared for replication in both E. coli and Saccharomyces cerevisiae by linking sequences from an E. coli plasmid with sequences from the yeast 2μ plasmid.

[0082] The present invention extends to the preparation of antisense nucleotides, including ribozymes, that may be used to detect the presence of mRNA coding for CLN2 or interfere with the expression of CLN2 at the translational level. This approach utilizes antisense nucleic acid and ribozymes to hybridize to CLN2 mRNA, which can block translation of a specific mRNA, either by masking that mRNA with an antisense nucleic acid or cleaving it with a ribozyme.

[0083] Antisense nucleic acids are DNA or RNA molecules that are complementary to at least a portion of a specific mRNA molecule (see Marcus-Sekura, 1988, Anal. Biochem. 172:298). In the cell, they hybridize to that mRNA, forming a double stranded molecule. The cell does not translate an mRNA in this double-stranded form. Therefore, antisense nucleic acids interfere with the expression of mRNA into protein. Oligomers of about fifteen nucleotides and molecules that hybridize to the AUG initiation codon will be particularly efficient, since they are easy to synthesize and are likely to pose fewer problems than larger molecules when introducing them into organ cells. Antisense methods have been used to inhibit the expression of many genes in vitro (Marcus-Sekura, 1988, supra; Hambor et al., 1988, J. Exp. Med. 168:1237). Preferably synthetic antisense nucleotides contain phosphoester analogs, such as phosphorothioates, or thioesters, rather than natural phosphoester bonds. Such phosphoester bond analogs are more resistant to degradation, increasing the stability, and therefore the efficacy, of the antisense nucleic acids.
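
The design of an oligomer of about fifteen nucleotides that hybridizes across the AUG initiation codon may be sketched as follows. The sketch assumes an unmodified antisense RNA written 5′ to 3′, takes the first AUG in the supplied mRNA as the initiation codon, and centers the target window on that codon where the sequence ends permit. The name `antisense_oligo` and the example mRNA in the closing note are hypothetical, and the example is not a CLN2 sequence.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna: str, length: int = 15) -> str:
    """Return an antisense RNA oligomer (5'->3') covering the first AUG.

    Assumptions: the first AUG found is the initiation codon, and the
    mRNA is at least `length` nucleotides long.
    """
    mrna = mrna.upper()
    if len(mrna) < length:
        raise ValueError("mRNA is shorter than the requested oligomer")
    aug = mrna.find("AUG")
    if aug == -1:
        raise ValueError("no AUG initiation codon found")
    # Centre the window on the AUG, then clamp it to the ends of the mRNA.
    start = max(0, min(aug - (length - 3) // 2, len(mrna) - length))
    target = mrna[start:start + length]
    # The antisense strand is the reverse complement of the target window.
    return target.translate(COMPLEMENT)[::-1]
```

For the made-up mRNA `GGCACCAUGGGACUCCAAGCC`, the function returns a fifteen-nucleotide oligomer that contains CAU, the reverse complement of the AUG codon, so that it pairs across the initiation site.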

[0084] Ribozymes are RNA molecules possessing the ability to specifically cleave other single stranded RNA molecules in a manner somewhat analogous to DNA restriction endonucleases. Ribozymes were discovered from the observation that certain mRNAs have the ability to excise their own introns. By modifying the nucleotide sequence of these RNAs, researchers have been able to engineer molecules that recognize specific nucleotide sequences in an RNA molecule and cleave it (Cech, 1988, J. Am. Med. Assoc. 260:3030). Because they are sequence-specific, only mRNAs with particular sequences are inactivated.

[0085] Investigators have identified two types of ribozymes, Tetrahymena-type and “hammerhead”-type (Haseloff and Gerlach, 1988). Tetrahymena-type ribozymes recognize four-base sequences, while “hammerhead”-type recognize eleven- to eighteen-base sequences. The longer the recognition sequence, the more likely it is to occur exclusively in the target mRNA species. Therefore, hammerhead-type ribozymes are preferable to Tetrahymena-type ribozymes for inactivating a specific mRNA species, and eighteen base recognition sequences are preferable to shorter recognition sequences.
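
The relationship between recognition sequence length and specificity can be made concrete with a rough estimate. The sketch below assumes, as a simplification, that the four bases occur independently and with equal frequency, so that a given sequence of k bases is expected to occur by chance about once in every 4^k positions. The function name `expected_chance_matches` and the pool size used in the closing note are hypothetical.

```python
def expected_chance_matches(recognition_length: int, pool_nt: int) -> float:
    """Expected number of chance occurrences of one specific recognition
    sequence in `pool_nt` nucleotides of RNA.

    Assumption: bases are independent and equally frequent, so each
    position matches a given k-base sequence with probability 1 / 4**k.
    """
    return pool_nt / 4 ** recognition_length
```

Under this simplification, a four-base recognition sequence is expected about 1,000 times in a pool of 256,000 nucleotides, whereas an eighteen-base sequence is expected far less than once, which is consistent with the preference stated above for longer recognition sequences.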

[0086] The DNA sequences encoding CLN2, and variants (e.g. mutants associated with LINCL) thereof, described and enabled herein may thus be used to prepare antisense molecules that hybridize to and ribozymes that cleave mRNAs for CLN2, thus inhibiting expression of the gene encoding CLN2. A preferred embodiment would entail targeting mutant alleles of the CLN2 gene associated with LINCL.

Expression of CLN2 Proteins

[0087] The nucleotide sequence coding for CLN2, or antigenic fragment, derivative or analog thereof, or a functionally active derivative, including a chimeric protein, thereof, can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. Such elements are termed herein a “promoter.” Thus, the nucleic acid encoding CLN2 of the invention is operably associated with a promoter in an expression vector of the invention. Both cDNA and genomic sequences can be cloned and expressed under control of such regulatory sequences. An expression vector also preferably includes a replication origin, unless the vector is intended for homologous recombination.

[0088] The necessary transcriptional and translational signals can be provided on a recombinant expression vector, or they may be supplied by the native gene encoding CLN2 and/or its flanking regions.

[0089] As pointed out above, potential chimeric partners for CLN2 include substitute catalytic domains, or a different nuclear targeting domain.

[0090] Potential host-vector systems include but are not limited to mammalian cell systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors; or bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used.

[0091] A recombinant CLN2 protein of the invention, or functional fragment, derivative, chimeric construct, or analog thereof, may be expressed chromosomally, after integration of the coding sequence by recombination. In this regard, any of a number of amplification systems may be used to achieve high levels of stable gene expression (See Sambrook et al., 1989, supra).

[0092] The cell into which the recombinant vector comprising the nucleic acid encoding CLN2 has been introduced is cultured in an appropriate cell culture medium under conditions that provide for expression of CLN2 by the cell.

[0093] Any of the methods previously described for the insertion of DNA fragments into a cloning vector may be used to construct expression vectors containing a gene consisting of appropriate transcriptional/translational control signals and the protein coding sequences. These methods may include in vitro recombinant DNA and synthetic techniques and in vivo recombination (genetic recombination).

[0094] Expression of CLN2 protein may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Promoters which may be used to control CLN2 gene expression include, but are not limited to, the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3′ long terminal repeat of Rous sarcoma virus (Yamamoto, et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42); prokaryotic expression vectors such as the β-lactamase promoter (Villa-Kamaroff, et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), or the tac promoter (DeBoer, et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25); see also “Useful proteins from recombinant bacteria” in Scientific American, 1980, 242:74-94; promoter elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase promoter; and the animal transcriptional control regions, which exhibit tissue specificity and have been utilized in transgenic animals: elastase I gene control region which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature 315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58), alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94), myelin basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712), myosin light chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason et al., 1986, Science 234:1372-1378).

[0095] Expression vectors containing a nucleic acid encoding a CLN2 of the invention can be identified by five general approaches: (a) PCR amplification of the desired plasmid DNA or specific mRNA, (b) nucleic acid hybridization, (c) presence or absence of selection marker gene functions, (d) analysis with appropriate restriction endonucleases, and (e) expression of inserted sequences. In the first approach, the nucleic acids can be amplified by PCR to provide for detection of the amplified product. In the second approach, the presence of a foreign gene inserted in an expression vector can be detected by nucleic acid hybridization using probes comprising sequences that are homologous to an inserted marker gene. In the third approach, the recombinant vector/host system can be identified and selected based upon the presence or absence of certain “selection marker” gene functions (e.g., β-galactosidase activity, thymidine kinase activity, resistance to antibiotics, transformation phenotype, occlusion body formation in baculovirus, etc.) caused by the insertion of foreign genes in the vector. In another example, if the nucleic acid encoding CLN2 is inserted within the “selection marker” gene sequence of the vector, recombinants containing the CLN2 insert can be identified by the absence of reading frames), pAc360 (BamHI cloning site 36 base pairs downstream of a polyhedrin initiation codon: Invitrogen (195)), and pBlueBacHisA, B, C (three different reading frames, with BamHI, BglII, PstI, NcoI, and HindIII cloning site, an N-terminal peptide for ProBond purification, and blue/white recombinant screening of plaques: Invitrogen (220)) can be used.

[0096] Mammalian expression vectors contemplated for use in the invention include vectors with inducible promoters, such as the dihydrofolate reductase (DHFR) promoter, e.g., any expression vector with a DHFR expression vector, or a DHFR/methotrexate co-amplification vector, such as pED (PstI, SalI, SbaI, SmaI, and EcoRI cloning site, with the vector expressing both the cloned gene and DHFR; see Kaufman, Current Protocols in Molecular Biology, 16.12 (1991)). Alternatively, a glutamine synthetase/methionine sulfoximine co-amplification vector, such as pEE14 (HindIII, XbaI, SmaI, SbaI, EcoRI, and BclI cloning site, in which the vector expresses glutamine synthase and the cloned gene; Celltech), can be used. In another embodiment, a vector that directs episomal expression under control of Epstein Barr Virus (EBV) can be used, such as pREP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive hCMV immediate early gene, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning site, inducible metallothionein IIa gene promoter, hygromycin selectable marker; Invitrogen), pREP8 (BamHI, XhoI, NotI, HindIII, NheI, and KpnI cloning site, RSV-LTR promoter, histidinol selectable marker; Invitrogen), pREP9 (KpnI, NheI, HindIII, NotI, XhoI, SfiI, and BamHI cloning site, RSV-LTR promoter, G418 selectable marker; Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin selectable marker, N-terminal peptide purifiable via ProBond resin and cleaved by enterokinase; Invitrogen). Selectable mammalian expression vectors for use in the invention include pRc/CMV (HindIII, BstXI, NotI, SbaI, and ApaI cloning site, G418 selection; Invitrogen), pRc/RSV (HindIII, SpeI, BstXI, NotI, XbaI cloning site, G418 selection; Invitrogen), and others. Vaccinia virus mammalian expression vectors (see Kaufman, 1991, supra) for use according to the invention include but are not limited to pSC11 (SmaI cloning site, TK- and β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI, and HindIII cloning site; TK- and β-gal selection), and pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI, and Hpa cloning site, TK or XPRT selection).

[0097] Yeast expression systems can also be used according to the invention to express CLN2 polypeptide. For example, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI, and HindIII cloning site; Invitrogen) or the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI, and HindIII cloning site, N-terminal peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), to mention just two, can be employed according to the invention.

[0098] Once a particular recombinant DNA molecule is identified and isolated, several methods known in the art may be used to propagate it. Once a suitable host system and growth conditions are established, recombinant expression vectors can be propagated and prepared in quantity. As previously explained, the expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda), and plasmid and cosmid DNA vectors, to name but a few.

[0099] In addition, a host cell strain may be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage [e.g., of signal sequence]) of proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. For example, expression in a bacterial system can be used to produce a nonglycosylated core protein product. Expression in yeast can produce a glycosylated product. Expression in eukaryotic cells can increase the likelihood of “native” folding of a heterologous protein. Moreover, expression in mammalian cells can provide a tool for reconstituting, or constituting, CLN2 activity. Furthermore, different vector/host expression systems may affect processing reactions, such as proteolytic cleavages, to a different extent.

[0100] Vectors are introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (liposome fusion), use of a gene gun (biolistics), or a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem. 263:14621-14624; Hartmut et al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990).

Antibodies to CLN2

[0101] According to the invention, CLN2 protein purified from natural sources, produced recombinantly or by chemical synthesis, and fragments or other derivatives or analogs thereof, including fusion proteins, may be used as an immunogen to generate antibodies that recognize the CLN2 protein or mutant variants associated with LINCL. Such antibodies are referred to as specific for CLN2, or characterized by specific binding to CLN2. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. In specific embodiments, infra, a CLN2-poly-histidine fusion protein, and a CLN2-maltose binding protein (MBP) fusion protein were used as antigens. The anti-CLN2 antibodies of the invention may be cross reactive, e.g., they may recognize CLN2 from different species. Polyclonal antibodies have greater likelihood of cross reactivity. Alternatively, an antibody of the invention may be specific for a single form of CLN2, such as murine CLN2. Preferably, such an antibody is specific for human CLN2.

[0102] Various procedures known in the art may be used for the production of polyclonal antibodies to CLN2 protein, a recombinant CLN2, or a derivative or analog thereof. For the production of antibody, various host animals can be immunized by injection with the CLN2 protein, or a derivative (e.g., fragment or fusion protein) thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the CLN2 protein, or more preferably a fragment thereof, can be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

[0103] For preparation of monoclonal antibodies directed toward the CLN2 protein, or fragment, analog, or derivative thereof, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In an additional embodiment of the invention, monoclonal antibodies can be produced in germ-free animals utilizing recent technology (PCT/US90/02545). According to the invention, human antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV virus in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96). In fact, according to the invention, techniques developed for the production of “chimeric antibodies” (Morrison et al., 1984, J. Bacteriol. 159:870; Neuberger et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule specific for a CLN2 protein together with genes from a human antibody molecule of appropriate biological activity can be used; such antibodies are within the scope of this invention. Such human or humanized chimeric antibodies are preferred for use in therapy of human diseases or disorders (described infra), since the human or humanized antibodies are much less likely than xenogenic antibodies to induce an immune response, in particular an allergic response, themselves.

[0104] According to the invention, techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce CLN2 protein-specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for a CLN2 protein, or its derivatives, or analogs.

[0105] Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab′)₂ fragment which can be produced by pepsin digestion of the antibody molecule; the Fab′ fragments which can be generated by reducing the disulfide bridges of the F(ab′)₂ fragment; and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.

[0106] In the production of antibodies, screening for the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, immunoelectrophoresis assays, or enzymatic assays for CLN2, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of a CLN2 protein, one may assay generated hybridomas for a product which binds to a CLN2 protein fragment containing such epitope. For selection of an antibody specific to a CLN2 protein from a particular species of animal, one can select on the basis of positive binding with CLN2 protein expressed by or isolated from cells of that species of animal.

[0107] According to the invention, the antibodies specific for CLN2 can be labeled. Suitable labels include enzymes, fluorophores (e.g., fluorescein isothiocyanate (FITC), phycoerythrin (PE), Texas red (TR), rhodamine, free or chelated lanthanide series salts, especially Eu³⁺, to name a few fluorophores), chromophores, radioisotopes, chelating agents, dyes, colloidal gold, latex particles, ligands (e.g., biotin), and chemiluminescent agents. When a control marker is employed, the same or different labels may be used for the receptor and control marker.

[0108] In the instance where a radioactive label, such as the isotopes ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re, is used, known currently available counting procedures may be utilized. In the instance where the label is an enzyme, detection may be accomplished by any of the presently utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques known in the art.

[0109] Direct labels are one example of labels which can be used according to the present invention. A direct label has been defined as an entity, which in its natural state, is readily visible, either to the naked eye, or with the aid of an optical filter and/or applied stimulation, e.g., U.V. light to promote fluorescence. Examples of colored labels which can be used according to the present invention include metallic sol particles, for example, gold sol particles such as those described by Leuvering (U.S. Pat. No. 4,313,734); dye sol particles such as described by Gribnau et al. (U.S. Pat. No. 4,373,932) and May et al. (WO 88/08534); dyed latex such as described by May, supra, Snyder (EP-A 0 280 559 and 0 281 327); or dyes encapsulated in liposomes as described by Campbell et al. (U.S. Pat. No. 4,703,017). Other direct labels include a radionuclide, a fluorescent moiety or a luminescent moiety. In addition to these direct labeling devices, indirect labels comprising enzymes can also be used according to the present invention. Various types of enzyme linked immunoassays are well known in the art, for example, those using alkaline phosphatase, horseradish peroxidase, lysozyme, glucose-6-phosphate dehydrogenase, lactate dehydrogenase, or urease; these and others have been discussed in detail by Eva Engvall in Enzyme Immunoassay ELISA and EMIT in Methods in Enzymology, 70, 419-439, 1980 and in U.S. Pat. No. 4,857,453.

[0110] Other labels for use in the invention include magnetic beads or magnetic resonance imaging labels.

[0111] In another embodiment, a phosphorylation site can be created on an antibody of the invention for labeling with ³²P, e.g., as described in European Patent No. 0372707 (application No. 89311108.8) by Sidney Pestka, or U.S. Pat. No. 5,459,240, issued Oct. 17, 1995 to Foxwell et al.

[0112] As exemplified herein, proteins, including antibodies, can be labeled by metabolic labeling. Metabolic labeling occurs during in vitro incubation of the cells that express the protein in the presence of culture medium supplemented with a metabolic label, such as [³⁵S]-methionine or [³²P]-orthophosphate. In addition to metabolic (or biosynthetic) labeling with [³⁵S]-methionine, the invention further contemplates labeling with [¹⁴C]-amino acids and [³H]-amino acids (with the tritium substituted at non-labile positions).

[0113] The foregoing antibodies can be used in methods known in the art relating to the localization and activity of the CLN2 protein, e.g., for Western blotting, imaging CLN2 protein in situ, measuring levels thereof in appropriate physiological samples, immunohistochemistry, etc.

[0114] In a specific embodiment, antibodies that agonize or antagonize the activity of CLN2 protein, or of a mutant variant associated with LINCL, can be generated.

Detection of CLN2 and Implications thereof

[0115] According to the invention, the presence, amount, or activity level of CLN2 may be a useful prognostic indicator for LINCL and a useful tool for assessing the efficacy of LINCL therapeutic treatment. Accordingly, the present invention provides for assays detecting the presence, measuring the amount, and/or quantitating the activity of CLN2 protein or, in the former two cases, mRNA in a sample. The diagnostic methods can be used to detect a CLN2 gene or mRNA, or CLN2 protein, in a biological sample from an individual. The biological sample can be a biological fluid comprising cells, such as but not limited to, blood, interstitial fluid, pleural effusions, urine, cerebrospinal fluid, and the like. Preferably, CLN2 is detected in blood, which is readily obtained. Alternatively, CLN2 can be detected from cellular sources, such as, but not limited to, tissue biopsies, brain, adipocytes, testes, heart, and the like. For example, cells can be obtained from an individual by biopsy and lysed, e.g., by freeze-thaw cycling, or treatment with a mild cytolytic detergent such as, but not limited to, TRITON X-100, digitonin, NONIDET P-40 (NP-40), saponin, and the like, or combinations thereof (see, e.g., International Patent Publication WO 92/08981, published May 29, 1992). In yet another embodiment, samples containing both cells and body fluids can be used (see ibid.).

[0116] In another embodiment, a lower level or lack of CLN2 expression in a sample LINCL-affected cell compared to a normal cell may be indicative of the LINCL disease. Thus, the invention contemplates a method for detecting LINCL disease in a sample cell comprising detecting the level of mammalian CLN2 in a cell with the LINCL phenotype, and comparing the level of CLN2 detected with the level in a normal cell, wherein a lower level of CLN2 in the sample cell than in the normal cell indicates LINCL disease. The level of CLN2 can be detected by detecting mRNA or CLN2 protein, the latter by immunoassay or biochemistry, as described infra. This method is not only of diagnostic value, but can be used to assess the efficacy of LINCL therapeutic treatment.

[0117] In yet another embodiment, the assay can be based on quantitating CLN2 pepstatin-insensitive carboxyl protease activity. Again, this method is not only of diagnostic value, but can be used to assess the efficacy of LINCL therapeutic treatment.

[0118] In still yet another embodiment, a method is contemplated for detecting the CLN2 gene, and mutant variants associated with LINCL, in chromosomal samples, comprising: contacting a chromosomal sample from, for example, amniotic fluid, with oligonucleotides complementary to CLN2 or variant mutant alleles of CLN2, under conditions that allow for hybridization; and detecting hybridization of the oligonucleotides to the chromosomes in the sample. Such a method would prove invaluable as a prenatal screening test for LINCL.

[0119] The present invention includes an assay system which may be prepared in the form of a test kit for the quantitative analysis of the extent of the presence of CLN2, or to identify drugs or other agents that may mimic or block its activity. The system or test kit may comprise a labeled component, such as an antibody or oligonucleotide specific for CLN2 protein or mRNA, respectively. Preferably, an assay kit of the invention also comprises a positive control reagent, either CLN2 protein or CLN2 mRNA, for confirming assay performance, and, if desired, for quantitation.

[0120] In one embodiment, the present invention provides for the detection of expression of CLN2 or mRNA encoding CLN2. For example, an antisense oligonucleotide of the invention can be used in standard Northern hybridization analysis to detect the presence, and in some instances quantitate the level of expression, of CLN2 mRNA. An oligonucleotide of the invention may also be used to detect mutations in the CLN2 mRNA or gene, by high stringency hybridization analysis with a mutant specific probe (or a wild-type specific probe) with detection of hybridization or lack thereof indicating whether the gene is mutated. For example, hybridization of a wild-type specific probe indicates no mutation, and lack of hybridization indicates a mutation. The reverse would be true for a mutation-specific probe. The techniques for preparing labeled oligonucleotides and using them to analyze gene expression or mutations are well known in the art.
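
The probe logic of paragraph [0120] can be written as a small decision rule. The following Python sketch is illustrative only: the function name and the string labels "wild-type" and "mutant" are assumptions introduced here, and the rule mirrors the simplified statement in the text (it does not distinguish heterozygous from homozygous samples).

```python
def interpret_hybridization(probe_type: str, hybridized: bool) -> str:
    """Map a high-stringency hybridization result to a call at the probed site.

    probe_type is a hypothetical label: "wild-type" for a wild-type specific
    probe, "mutant" for a mutation-specific probe.
    """
    if probe_type == "wild-type":
        # Wild-type probe: hybridization means no mutation, absence means mutation.
        return "no mutation" if hybridized else "mutation"
    if probe_type == "mutant":
        # Mutation-specific probe: the reverse holds.
        return "mutation" if hybridized else "no mutation"
    raise ValueError(f"unknown probe type: {probe_type!r}")


if __name__ == "__main__":
    for probe in ("wild-type", "mutant"):
        for signal in (True, False):
            print(probe, signal, "->", interpret_hybridization(probe, signal))
```

In practice, a sample would likely be tested with both probe types, since a wild-type specific probe alone may still hybridize to the normal allele of a carrier.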

[0121] Alternatively, oligonucleotides of the invention can be used as PCR primers to amplify CLN2 mRNA (e.g., by reverse transcriptase-PCR), or CLN2 genes. The amplified mRNA can be quantified, or either amplified mRNA or genomic DNA can be analyzed for mutations. Mutations in the amplified DNA can be detected by creation or deletion of restriction fragment length polymorphisms (RFLPs) not found in the native gene or cDNA, hybridization with a mutation specific probe (or lack of hybridization with a wild-type specific probe), as well as by other techniques.
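
As an illustration of how a point mutation can create or delete an RFLP, the following Python sketch digests a linear sequence in silico and compares fragment lengths. The two 18-nt sequences are toy examples invented for this sketch and are not CLN2 sequences; the EcoRI recognition site (GAATTC, cut after the first G) is used only as a familiar example.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after complete digestion of a linear sequence.

    site is the recognition sequence; cut_offset is the position within the
    site after which the top strand is cut (1 for EcoRI, G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical amplified segments (toy data, not taken from the CLN2 gene).
wild_type = "ATGCCGAATTCGGTACCA"   # contains one GAATTC site
mutant = "ATGCCGAGTTCGGTACCA"      # single-base change destroys the site

print(fragment_lengths(wild_type, "GAATTC", 1))  # two fragments
print(fragment_lengths(mutant, "GAATTC", 1))     # one uncut fragment
```

A difference in the fragment pattern between a patient sample and the native sequence would then flag the segment for sequencing or probe analysis.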

[0122] The presence or level of CLN2 protein can be measured by immunoassay using an antibody of the invention. Various immunoassay techniques are known in the art, e.g., as described in the “Antibody” section above. In a specific embodiment, infra, a rabbit polyclonal antiserum detects CLN2. In an immunoassay, an antibody may be introduced into a biological sample. After the antibody has had an opportunity to react with sites within the sample, the resulting product mass may be examined by known techniques, which may vary, e.g., with the nature of the label attached.

[0123] Finally, biochemical or immunochemical/biochemical (e.g., immunoprecipitation) techniques can be used to detect the presence and/or level of CLN2. For example, in one embodiment, a cell may be metabolically labeled (as described in the “Antibody” section, supra, and the Examples, infra), the cell lysed and analyzed by PAGE, and the presence of a ~46 kDa band evaluated. Furthermore, the band can be quantitated by densitometry. Alternatives to metabolic labeling include Western analysis, silver staining, Coomassie blue staining, etc. In another embodiment, the presence and level of CLN2 activity can be detected enzymatically, e.g., by testing the catalytic activity of a cellular extract or isolated protein corresponding to CLN2.

Therapeutic Aspects of CLN2

[0124] Based on the data developed in the Examples, infra, particularly the observation that absence of CLN2 or presence of a mutated variant of CLN2 is associated with LINCL, CLN2 may be employed as a therapeutic to ameliorate LINCL. Thus, according to the invention, CLN2, or an expression vector encoding CLN2, can be administered to a subject in need of treatment for LINCL in order to agonize CLN2 activity and thus ameliorate LINCL. The methods of administration described herein can be employed to agonize or antagonize CLN2 activity.

[0125] Various mechanisms are available for increasing CLN2 activity in cells, e.g., direct administration of a construct (chimeric or via chemical derivatization or crosslinking) of CLN2 with a targeting molecule (e.g., transferrin, a hormone, a growth factor, or a target cell-specific antibody) to a subject in need of treatment, or by gene therapy approaches to increase expression of CLN2 in proliferating cells in situ.

[0126] A subject in whom administration of CLN2 is an effective therapeutic regimen for LINCL is preferably a human, but can be any animal. Thus, as can be readily appreciated by one of ordinary skill in the art, the methods and pharmaceutical compositions of the present invention are particularly suited to administration to any animal, particularly a mammal, including, but by no means limited to, domestic animals, such as feline or canine subjects, farm animals, such as but not limited to bovine, equine, caprine, ovine, and porcine subjects, wild animals (whether in the wild or in a zoological garden), research animals, such as mice, rats, rabbits, goats, sheep, pigs, dogs, cats, etc., avian species, such as chickens, turkeys, songbirds, etc., i.e., for veterinary medical use.

[0127] Preferably, a composition of the invention for treatment of LINCL is provided in a pharmaceutically acceptable carrier or excipient. The phrase “pharmaceutically acceptable” refers to molecular entities and compositions that are physiologically tolerable and do not typically produce an allergic or similar untoward reaction, such as gastric upset, dizziness and the like, when administered to a human. Preferably, as used herein, the term “pharmaceutically acceptable” means approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, and more particularly in humans, although a pharmaceutically acceptable carrier of the invention may share the attributes of such an approved carrier without itself having been approved. The term “carrier” refers to a diluent, adjuvant, excipient, or vehicle with which the compound is administered. Such pharmaceutical carriers can be sterile liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. Water or aqueous saline solutions and aqueous dextrose and glycerol solutions are preferably employed as carriers, particularly for injectable solutions. Suitable pharmaceutical carriers are described in “Remington's Pharmaceutical Sciences” by E. W. Martin.

[0128] The phrase “therapeutically effective amount” is used herein to mean an amount sufficient to reduce by at least about 15 percent, preferably by at least 50 percent, more preferably by at least 90 percent, and most preferably prevent, a clinically significant deficit in the activity, function and response of the host. Alternatively, a therapeutically effective amount is sufficient to cause an improvement in a clinically significant condition in the host. According to the invention, where amelioration of LINCL is sought, a therapeutically effective amount of a pharmaceutical composition of the invention will restore pepstatin-insensitive carboxyl protease activity to levels that ameliorate LINCL. A therapeutically effective amount and treatment regimen can be developed for an individual by an ordinary skilled physician, taking into account the age, sex, size, and physical well-being of the patient; the course and extent of the disease or disorder; previous, concurrent, or subsequent treatment regimens and the potential for drug interactions; all of which parameters are routinely considered by a physician in prescribing administration of a pharmaceutical agent.

[0129] The instant invention provides for conjugating targeting molecules to CLN2, DNA vectors (including viruses) encoding CLN2, and carriers (e.g., liposomes) for targeting to a desired cell or tissue, e.g., a tumor. “Targeting molecule” as used herein shall mean a molecule which, when administered in vivo, localizes to desired location(s).

[0130] In various embodiments, the targeting molecule can be a peptide or protein, antibody, lectin, carbohydrate, or steroid. In one embodiment, the targeting molecule is a protein or peptide ligand of an internalized receptor on the target cell.

[0131] In a specific embodiment, the targeting molecule is a peptide comprising the well known RGD sequence, or variants thereof that bind RGD receptors on the surface of cells such as cancer cells, e.g., human ova that have receptors that recognize the RGD sequence. Other ligands include, but are not limited to, transferrin, insulin, amylin, and the like. Receptor internalization is preferred to facilitate intracellular delivery of CLN2 protein.

[0132] In another embodiment, the targeting molecule is an antibody. Preferably, the targeting molecule is a monoclonal antibody. In one embodiment, to facilitate crosslinking the antibody can be reduced to two heavy and light chain heterodimers, or the F(ab′)₂ fragment can be reduced, and crosslinked to the CLN2 via the reduced sulfhydryl.

[0133] Antibodies for use as targeting molecules are specific for a cell surface antigen. In one embodiment, the antigen is a receptor. For example, an antibody specific for a receptor on cancer cells, such as melanoma cells, can be used.

[0134] This invention further provides for the use of other targeting molecules, such as lectins, carbohydrates, proteins and steroids.

Administration of Targeted CLN2

[0135] According to the invention, a therapeutic composition of the invention may be introduced parenterally, transmucosally, e.g., orally, nasally, or rectally, or transdermally. Preferably, administration is parenteral, e.g., via intravenous injection, and also includes, but is not limited to, intra-arteriole, intramuscular, intradermal, subcutaneous, intraperitoneal, intraventricular, and intracranial administration.

[0136] In another embodiment, the therapeutic compound can be delivered in a vesicle, in particular a liposome (see Langer, Science 249:1527-1533 (1990); Treat et al., in Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler (eds.), Liss, N.Y., pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see generally ibid.). To reduce its systemic side effects and increase cellular penetration, this may be a preferred method for introducing CLN2.

[0137] In yet another embodiment, the therapeutic compound can be delivered in a controlled release system. For example, the polypeptide may be administered using intravenous infusion, an implantable osmotic pump, a transdermal patch, liposomes, or other modes of administration. In one embodiment, a pump may be used (see Langer, supra; Sefton, CRC Crit. Rev. Biomed. Eng. 14:201 (1987); Buchwald et al., Surgery 88:507 (1980); Saudek et al., N. Engl. J. Med. 321:574 (1989)). In another embodiment, polymeric materials can be used (see Medical Applications of Controlled Release, Langer and Wise (eds.), CRC Press, Boca Raton, Fla. (1974); Controlled Drug Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Langer and Peppas, J. Macromol. Sci. Rev. Macromol. Chem. 23:61 (1983); see also Levy et al., Science 228:190 (1985); During et al., Ann. Neurol. 25:351 (1989); Howard et al., J. Neurosurg. 71:105 (1989)). In yet another embodiment, a controlled release system can be placed in proximity of the therapeutic target, thus requiring only a fraction of the systemic dose (see, e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138 (1984)). Preferably, a controlled release device is introduced into a subject in proximity of the site of LINCL-affected tissue.

[0138] Other controlled release systems are discussed in the review by Langer (Science 249:1527-1533 (1990)).

Gene Therapy

[0139] In one embodiment, a gene encoding a CLN2 protein or polypeptide domain fragment thereof is introduced in vivo or ex vivo in a nucleic acid vector. Viral vectors commonly used for in vivo or ex vivo targeting and therapy procedures are DNA-based vectors and retroviral vectors. Methods for constructing and using viral vectors are known in the art (see, e.g., Miller and Rosman, BioTechniques 7:980-990 (1992)). DNA vectors include an attenuated or defective DNA virus, such as but not limited to herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, adeno-associated virus (AAV), and the like. Defective viruses, which entirely or almost entirely lack viral genes, are preferred. Defective virus is not infective after introduction into a cell. Use of defective viral vectors allows for administration to cells in a specific, localized area, without concern that the vector can infect other cells. Thus, tumor tissue can be specifically targeted. Examples of particular vectors include, but are not limited to, a defective herpes virus 1 (HSV1) vector (Kaplitt et al., 1991, Molec. Cell. Neurosci. 2:320-330), an attenuated adenovirus vector, such as the vector described by Stratford-Perricaudet et al. (1992, J. Clin. Invest. 90:626-630), and a defective adeno-associated virus vector (Samulski et al., 1987, J. Virol. 61:3096-3101; Samulski et al., 1989, J. Virol. 63:3822-3828).

[0140] Preferably, for in vivo administration, an appropriate immunosuppressive treatment is employed in conjunction with the viral vector, e.g., adenovirus vector, to avoid immuno-deactivation of the viral vector and transfected cells. For example, immunosuppressive cytokines, such as interleukin-12 (IL-12), interferon-γ (IFN-γ), or anti-CD4 antibody, can be administered to block humoral or cellular immune responses to the viral vectors (see, e.g., Wilson, Nature Medicine (1995)). In addition, it is advantageous to employ a viral vector that is engineered to express a minimal number of antigens.

[0141] In another embodiment the gene can be introduced in a retroviral vector, e.g., as described in Anderson et al., U.S. Pat. No. 5,399,346; Mann et al., 1983, Cell 33:153; Temin et al., U.S. Pat. No. 4,650,764; Temin et al., U.S. Pat. No. 4,980,289; Markowitz et al., 1988, J. Virol. 62:1120; Temin et al., U.S. Pat. No. 5,124,263; International Patent Publication No. WO 95/07358, published Mar. 16, 1995, by Dougherty et al.; and Kuo et al., 1993, Blood 82:845.

[0142] Targeted gene delivery is described in International Patent Publication WO 95/28494, published October 1995.

[0143] Alternatively, the vector can be introduced in vivo by lipofection. For the past decade, there has been increasing use of liposomes for encapsulation and transfection of nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers encountered with liposome mediated transfection can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., 1987, Proc. Natl. Acad. Sci. U.S.A. 84:7413-7417; see Mackey et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:8027-8031). The use of cationic lipids may promote encapsulation of negatively charged nucleic acids, and also promote fusion with negatively charged cell membranes (Felgner and Ringold, 1989, Science 337:387-388). The use of lipofection to introduce exogenous genes into the specific organs in vivo has certain practical advantages. Molecular targeting of liposomes to specific cells represents one area of benefit. It is clear that directing transfection to particular cell types would be particularly advantageous in a tissue with cellular heterogeneity, such as pancreas, liver, kidney, and the brain. Lipids may be chemically coupled to other molecules for the purpose of targeting (see Mackey et al., 1988, supra). Targeted peptides, e.g., hormones or neurotransmitters, and proteins such as antibodies, or non-peptide molecules could be coupled to liposomes chemically.

[0144] It is also possible to introduce the vector in vivo as a naked DNA plasmid. Naked DNA vectors for gene therapy can be introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, biolistics (use of a gene gun), or use of a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem. 263:14621-14624; Hartmut et al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990).

[0145] The present invention may be better understood by reference to the following Examples, which are provided by way of exemplification and are in no way limiting.

EXAMPLE 1

[0146] Isolation and identification of CLN2 and its corresponding gene product. If LINCL results from the absence or deficiency of a lysosomal enzyme, then its corresponding Man 6-phosphorylated form should also be absent or decreased. To test this possibility, detergent soluble extracts of autopsy brain samples from a LINCL patient and a normal control were fractionated by 2D gel electrophoresis and Man 6-P glycoproteins detected after transfer to nitrocellulose using an iodinated fragment of the MPR (9) (FIG. 1). Normal brain contains ~75 distinct spots representing multiple isoforms of different Man 6-P containing glycoproteins (FIG. 1, top). LINCL brain is remarkably similar, except one prominent spot is absent (FIG. 1, bottom). The corresponding normal spot has an apparent MW of 46,000 Da and an isoelectric point centered at pH ~6.0. Extracts from 4 LINCL patients were also compared with 3 normal controls by one dimensional SDS-PAGE, with the consistent observation that this major Man 6-phosphorylated glycoprotein in the healthy extracts was absent in the LINCL brain (data not shown).

[0147] In order to identify this potential candidate for CLN2, total Man 6-P containing glycoproteins were purified (10,11) from normal brain by affinity chromatography on a column of immobilized MPR and, after fractionation by SDS-PAGE and transfer to a PVDF membrane, the band that was absent in the LINCL specimens was isolated and sequenced. This sequence was compared against the SWISSPROT database and against the predicted translation products from the GENBANK database using BLASTP and TBLASTN, respectively. No significant sequence homologies were observed, revealing it to be a novel Man 6-P glycoprotein, and thus presumably a previously uncharacterized human lysosomal enzyme. The N-terminal sequence was then compared with predicted translation products from the expressed sequence tag (EST) database (dbEST) using TBLASTN. The initial search of the database detected a murine clone encoding a sequence identical to the peptide in 16 of 20 positions and later releases of dbEST contained human clones identical to the peptide in 19 of 20 positions. By iterative database searching and sequencing select clones¹, a nearly full length sequence for the human CLN2 candidate was assembled (FIG. 3). The 5′ end of the human cDNA was obtained by two rounds of polymerase chain amplification of the CLN2 candidate from a human cortex cDNA library (Stratagene) using two different gene specific primers and a single vector-specific primer². The composite sequence of the CLN2 candidate (FIG. 3) was subsequently confirmed from a genomic clone and amplified segments of genomic DNA from LINCL patients and normal controls.

EXAMPLE 2

[0148] Characterization of CLN2 and its corresponding gene product. The location of polyA tracts on different human EST cDNA clones indicates that there are two transcripts, with the polyA tail starting after nt 2503 for the short transcript and nt 3487 for the long transcript (FIG. 3). This is confirmed by northern blot analysis, which reveals two transcripts of ~2700 and 3700 nt (FIG. 2). mRNA was detected in all tissues examined (in addition to those tissues shown in FIG. 2, spleen, thymus, prostate, testis, ovary, small intestine, colon and peripheral blood leukocytes also expressed mRNA (not shown)) but levels were highest in heart and placenta and relatively similar in other tissues. The ubiquitous distribution of this mRNA indicated by Northern blotting is confirmed by the existence of highly related clones in many different cDNA libraries as found by database searches.

[0149] The CLN2 message long open reading frame encodes a 563-residue protein that is predicted to contain a 16-residue signal sequence (FIG. 3). There are no methionines between the putative initiation codon and the start of the chemically determined sequence at residue 195, indicating that the CLN2 precursor contains a long pro-region or consists of an N-terminal light and a C-terminal heavy chain. As all five potential glycosylation sites reside C-terminal to the cleavage site, should a light chain be present in the mature protein, it would not have been detected using the Man 6-P glycoprotein assay.

[0150] The predicted physical properties of the conceptually translated protein are in accordance with the observed properties of the protein that is missing in LINCL brain extracts, which has an apparent MW of 46,000 Da and a pI of 6.0. The calculated MW of the mature protein/heavy chain is 39,700 Da. Assuming all glycosylation sites are utilized and an average MW of 1800 Da for each oligosaccharide, the total MW would be ~48,000 Da. The calculated isoelectric point is 6.13 without considering post-translational modifications, e.g., Man 6-P residues, which would shift the isoelectric point towards the acidic range.
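
The mass estimate in paragraph [0150] is simple arithmetic and can be reproduced as follows. This Python sketch uses only figures stated in paragraphs [0149] and [0150] (563-residue precursor, mature sequence starting at residue 195, five potential glycosylation sites, 39,700 Da polypeptide, 1800 Da per oligosaccharide); the variable names are illustrative.

```python
# Figures taken from paragraphs [0149] and [0150].
PRECURSOR_RESIDUES = 563        # length of the predicted precursor
MATURE_START = 195              # first residue of the chemically determined sequence
POLYPEPTIDE_MW_DA = 39_700      # calculated MW of the mature protein/heavy chain
GLYCOSYLATION_SITES = 5         # all five potential sites assumed to be used
OLIGOSACCHARIDE_MW_DA = 1_800   # assumed average MW per oligosaccharide

# Residues 195 through 563 inclusive form the mature protein/heavy chain.
mature_residues = PRECURSOR_RESIDUES - (MATURE_START - 1)

# Total mass = polypeptide mass + mass contributed by the oligosaccharides.
glycan_mw_da = GLYCOSYLATION_SITES * OLIGOSACCHARIDE_MW_DA
total_mw_da = POLYPEPTIDE_MW_DA + glycan_mw_da

print(f"mature residues: {mature_residues}")
print(f"glycan contribution: {glycan_mw_da} Da")
print(f"estimated glycoprotein MW: {total_mw_da} Da")
```

With these inputs the sum comes to 48,700 Da, in the same range as the ~48,000 Da estimate and the 46,000 Da apparent MW observed on gels.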

[0151] The absence of this 46 kDa lysosomal protein in LINCL patients makes it a likely candidate for CLN2. Strong support for this conclusion comes from the observation that the gene identified here maps to chromosome 11p15³, which is also the locus identified for CLN2 by genetic linkage analysis (3).

[0152] Direct evidence for the identification of CLN2 came from sequence analysis of DNA from LINCL patients and unaffected family members (Table 1). The gene structure (not shown) of the CLN2 candidate was determined by sequence comparison between PCR segments from a genomic clone and the cDNA sequence. This allowed analysis of both intronic and exonic sequences from LINCL patient DNA using genomic DNA prepared from cell lines⁴. Mutations were observed in two of the PCR segments generated from the DNA of LINCL patients. Two unrelated LINCL patients contained mutations within the codon (TGT) encoding Cys 365. In one case, a monoallelic transition of T to C resulted in a Cys to Arg substitution; presumably the defect in this patient is compound heterozygous and there is therefore an additional as yet unidentified mutant allele. Providing evidence that this substitution represents a deleterious mutation rather than a polymorphism is the observation that another patient contains a different mutation in the same codon. In this case, a homozygous G to A transition resulted in a Cys to Tyr substitution in the protein expressed from both alleles. Should this Cys prove to be involved in disulfide bonding, mutations are likely to be highly disruptive given the role of disulfide bonds in establishing and maintaining protein structure. Different compound heterozygous mutations were found in two affected siblings. A heterozygous C to T transition resulted in the conversion of the codon (CGA) for Arg 208 to an umber (TGA) stop codon. In the other allele, the conserved AG of the intronic 3′ splice junction sequence is mutated to AC, which is likely to result in incorrect splicing of the CLN2 candidate mRNA. Each parent possessed a single different mutant allele and an unaffected sibling possessed only the premature stop mutation, indicating conventional Mendelian inheritance of these mutations.
None of these mutations were observed in the genomic clone, placental DNA from a normal subject or in any of the EST sequences which overlap these sites. When considered in conjunction with the chromosomal localization of this protein, the presence of these mutations unequivocally demonstrates that the protein identified here is CLN2.

TABLE 1 Genotype Analysis of LINCL Patients.

                                              MUTATION†
                                              C636T        T1107C      G1108A
cell line*                  splice junction‡  Arg208Stop   Cys365Arg   Cys365Tyr
C7786 unaffected sibling    +/+               −/+          +/+         +/+
C7787 PROBAND               −/+               −/+          +/+         +/+
C7788 PROBAND               −/+               −/+          +/+         +/+
C7789 mother                +/+               −/+          +/+         +/+
C7790 father                −/+               +/+          +/+         +/+
WG305                       +/+               +/+          +/+         −/−
WG308                       +/+               +/+          −/+         +/+

# first cousins providing a likely explanation for the homozygosity of the observed mutation.
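The codon-level consequences of the three point mutations listed in Table 1 can be checked against the standard genetic code. The following is a minimal illustrative sketch in Python and is not part of the original disclosure. The names `CODON_TABLE`, `mutate` and `consequence` are hypothetical, and the position of each substituted base within its codon is an inference from the stated codon and amino acid change, not something the text states.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# (label, reference codon, 0-based offset within the codon, substituted base).
# Codons and base changes come from the text; the offsets are inferred
# (assumption) as the only positions that give the reported residue change.
POINT_MUTATIONS = [
    ("C636T", "CGA", 0, "T"),   # reported as Arg208Stop
    ("T1107C", "TGT", 0, "C"),  # reported as Cys365Arg
    ("G1108A", "TGT", 1, "A"),  # reported as Cys365Tyr
]


def mutate(codon: str, offset: int, base: str) -> str:
    """Return the codon with the base at `offset` replaced by `base`."""
    return codon[:offset] + base + codon[offset + 1:]


def consequence(codon: str, offset: int, base: str) -> tuple[str, str, str]:
    """Return (reference residue, mutant codon, mutant residue)."""
    mutant = mutate(codon, offset, base)
    return CODON_TABLE[codon], mutant, CODON_TABLE[mutant]


for label, codon, offset, base in POINT_MUTATIONS:
    ref_aa, mutant, alt_aa = consequence(codon, offset, base)
    print(f"{label}: {codon} ({ref_aa}) -> {mutant} ({alt_aa})")
```

Under these assumptions the sketch reproduces the residue changes given in the column headings of Table 1. The splice junction mutation lies in an intron and is therefore not modeled here.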

[0153] It is likely that the CLN2 protein represents a previously unidentified type of lysosomal protease. Sequence comparisons revealed significant similarities⁵ between the CLN2 candidate and carboxyl peptidases from Pseudomonas (13) (PsCP) (17) and Xanthomonas (14) (XaCP) (18). Multiple alignments between the CLN2 candidate and the two bacterial proteases reveal significant blocks of sequence similarity, and both PsCP and XaCP have long propieces, with mature amino termini located proximal to the known amino terminus of the mature/heavy chain CLN2 candidate (FIG. 4, upper panel). PsCP and XaCP are highly unusual carboxyl proteinases that are not inhibited by pepstatin, the classical inhibitor of pepsin, cathepsin D, and other aspartyl proteases.

[0154] Analysis of brain autopsy specimens indicates that normal brain contains an acid protease activity not inhibited by pepstatin and E64, while this activity is essentially absent from CLN2 brains (FIG. 4, lower panel). Pepstatin-insensitive carboxyl proteases have not, to date, been reported to exist in mammals, and would thus have been overlooked in earlier biochemical studies of lysosomal activities in LINCL patients. One characteristic of LINCL is the storage of mitochondrial ATP synthase subunit c in the lysosomes of patients (19, 20, 21), which may indicate that subunit c represents a substrate for the CLN2 protein. Also, while the prominent neurological component of LINCL may be due to the susceptibility of neurons to metabolic insults, one intriguing possibility is that the CLN2 protein is involved in processing of neuron-specific trophic factors.

References

[0155] 1. R.-M. Boustany, Neurodystrophies and Neurolipidoses. H. W. Moser, Ed., Handbook of Clinical Neurology (Elsevier Science, Amsterdam, 1996), vol. 22(66), pp. 671-700.

[0156] 2. J. A. Rider, G. Dawson, A. N. Siakotos, American Journal of Medical Genetics 42, 519-24 (1992).

[0157] 3. J. D. Sharp, et al., Human Molecular Genetics 6, 591-5 (1997).

[0158] 4. J. Vesa, et al., Nature 376, 584-7 (1995).

[0159] 5. The International Batten Disease Consortium, Cell 82, 949-57 (1995).

[0160] 6. G. O. Ivy, F. Schottler, J. Wenzel, M. Baudry, G. Lynch, Science 226, 985-7 (1984).

[0161] 7. G. O. Ivy, American Journal of Medical Genetics 42, 555-60 (1992).

[0162] 8. S. Kornfeld, W. S. Sly, The Metabolic and Molecular Bases of Inherited Disease. C. R. Scriver, A. L. Beaudet, W. S. Sly, D. Valle, Eds. (McGraw-Hill, Inc., New York, 1995), vol. II, pp. 2495-2508.

[0163] 9. D. E. Sleat, I. Sohar, H. Lackland, J. Majercak, P. Lobel, Journal of Biological Chemistry 271, 19191-8 (1996).

[0164] 10. D. E. Sleat, S. R. Kraus, I. Sohar, H. Lackland, P. Lobel, Biochemical Journal 324, 33-39 (1997).

[0165] 11. K. J. Valenzano, L. M. Kallay, P. Lobel, Analytical Biochemistry 209, 156-62 (1993).

[0166] 17. K. Oda, T. Takahashi, Y. Tokuda, Y. Shibano, S. Takahashi, Journal of Biological Chemistry 269, 26518-24 (1994).

[0167] 18. K. Oda, et al., Journal of Biochemistry 120, 564-72 (1996).

[0168] 19. D. N. Palmer, I. M. Fearnley, S. M. Medd, American Journal of Medical Genetics 42, 561-567 (1992).

[0169] 20. D. N. Palmer, et al., Journal of Biological Chemistry 264, 5736-40 (1989).

[0170] 21. J. Ezaki, L. S. Wolfe, E. Kominami, Journal of Neurochemistry 67, 1677-1687 (1996).

[0171] The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and the accompanying figures. Such modifications are intended to fall within the scope of the appended claims.

[0172] It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description.

[0173] Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

1 12 3487 base pairs nucleic acid double linear cDNA NO 1 CGCGGAAGGGCAGAATGGGA CTCCAAGCCT GCCTCCTAGG GCTCTTTGCC CTCATCCTCT 60 CTGGCAAATGCAGTTACAGC CCGGAGCCCG ACCAGCGGAG GACGCTGCCC CCAGGCTGGG 120 TGTCCCTGGGCCGTGCGGAC CCTGAGGAAG AGCTGAGTCT CACCTTTGCC CTGAGACAGC 180 AGAATGTGGAAAGACTCTCG GAGCTGGTGC AGGCTGTGTC GGATCCCAGC TCTCCTCAAT 240 ACGGAAAATACCTGACCCTA GAGAATGTGG CTGATCTGGT GAGGCCATCC CCACTGACCC 300 TCCACACGGTGCAAAAATGG CTCTTGGCAG CCGGAGCCCA GAAGTGCCAT TCTGTGATCA 360 CACAGGACTTTCTGACTTGC TGGCTGAGCA TCCGACAAGC AGAGCTGCTG CTCCCTGGGG 420 CTGAGTTTCATCACTATGTG GGAGGACCTA CGGAAACCCA TGTTGTAAGG TCCCCACATC 480 CCTACCAGCTTCCACAGGCC TTGGCCCCCC ATGTGGACTT TGTGGGGGGA CTGCACCATT 540 TTCCCCCAACATCATCCCTG AGGCAACGTC CTGAGCCGCA GGTGACAGGG ACTGTAGGCC 600 TGCATCTGGGGGTAACCCCC TCTGTGATCC GTAAGCGATA CAACTTGACC TCACAAGACG 660 TGGGCTCTGGCACCAGCAAT AACAGCCAAG CCTGTGCCCA GTTCCTGGAG CAGTATTTCC 720 ATGACTCAGACCTGGCTCAG TTCATGCGCC TCTTCGGTGG CAACTTTGCA CATCAGGCAT 780 CAGTAGCCCGTGTGGTTGGA CAACAGGGCC GGGGCCGGGC CGGGATTGAG GCCAGTCTAG 840 ATGTGCAGTACCTGATGAGT GCTGGTGCCA ACATCTCCAC CTGGGTCTAC AGTAGCCCTG 900 GCCGGCATGAGGGACAGGAG CCCTTCCTGC AGTGGCTCAT GCTGCTCAGT AATGAGTCAG 960 CCCTGCCACATGTGCATACT GTGAGCTATG GAGATGATGA GGACTCCCTC AGCAGCGCCT 1020 ACATCCAGCGGGTCAACACT GAGCTCATGA AGGCTGCTGC TCGGGGTCTC ACCCTGCTCT 1080 TCGCCTCAGGTGACAGTGGG GCCGGGTGTT GGTCTGTCTC TGGAAGACAC CAGTTCCGCC 1140 CTACCTTCCCTGCCTCCAGC CCCTATGTCA CCACAGTGGG AGGCACATCC TTCCAGGAAC 1200 CTTTCCTCATCACAAATGAA ATTGTTGACT ATATCAGTGG TGGTGGCTTC AGCAATGTGT 1260 TCCCACGGCCTTCATACCAG GAGGAAGCTG TAACGAAGTT CCTGAGCTCT AGCCCCCACC 1320 TGCCACCATCCAGTTACTTC AATGCCAGTG GCCGTGCCTA CCCAGATGTG GCTGCACTTT 1380 CTGATGGCTACTGGGTGGTC AGCAACAGAG TGCCCATTCC ATGGGTGTCC GGAACCTCGG 1440 CCTCTACTCCAGTGTTTGGG GGGATCCTAT CCTTGATCAA TGAGCACAGG ATCCTTAGTG 1500 GCCGCCCCCCTCTTGGCTTT CTCAACCCAA GGCTCTACCA GCAGCATGGG GCAGGACTCT 1560 TTGATGTAACCCGTGGCTGC CATGAGTCCT GTCTGGATGA AGAGGTAGAG GGCCAGGGTT 1620 TCTGCTCTGGTCCTGGCTGG GATCCTGTAA CAGGCTGGGG AACACCCAAC TTCCCAGCTT 
1680 TGCTGAAGACTCTACTCAAC CCCTGACCCT TTCCTATCAG GAGAGATGGC TTGTCCCCTG 1740 CCCTGAAGCTGGCAGTTCAG TCCCTTATTC TGCCCTGTTG GAAGCCCTGC TGAACCCTCA 1800 ACTATTGACTGCTGCAGACA GCTTATCTCC CTAACCCTGA AATGCTGTGA GCTTGACTTG 1860 ACTCCCAACCCTACCATGCT CCATCATACT CAGGTCTCCC TACTCCTGCC TTAGATTCCT 1920 CAATAAGATGCTGTAACTAG CATTTTTTGA ATGCCTCTCC CTCCGCATCT CATCTTTCTC 1980 TTTTCAATCAGGCTTTTCCA AAGGGTTGTA TACAGACTCT GTGCACTATT TCACTTGATA 2040 TTCATTCCCCAATTCACTGC AAGGAGACCT CTACTGTCAC CGTTTACTCT TTCCTACCCT 2100 GACATCCAGAAACAATGGCC TCCAGTGCAT ACTTCTCAAT CTTTGCTTTA TGGCCTTTCC 2160 ATCATAGTTGCCCACTCCCT CTCCTTACTT AGCTTCCAGG TCTTAACTTC TCTGACTACT 2220 CTTGTCTTCCTCTCTCATCA ATTTCTGCTT CTTCATGGAA TGCTGACCTT CATTGCTCCA 2280 TTTGTAGATTTTTGCTCTTC TCAGTTTACT CATTGTCCCC TGGAACAAAT CACTGACATC 2340 TACAACCATTACCATCTCAC TAAATAAGAC TTTCTATCCA ATAATGATTG ATACCTCAAA 2400 TGTAAGATGCGTGATACTCA ACATTTCATC GTCCACCTTC CCAACCCCAA ACAATTCCAT 2460 CTCGTTTCTTCTTGGTAAAT GATGCTATGC TTTTTCCAAC CAAGCCAGAA ACCTGTGTCA 2520 TCTTTTCACCCCACCTTCAA TCAACAAGTC CTCAATCAAC AAGTCCTACT GACTGCACAT 2580 CTTAAATATATCTTTATCAG TCCACAAGTC CTTCCAATTA TATTTCCCAA GTATATCTAG 2640 AACTTATCCACTTATATCCC CACTGCTACT ACCTTAGTTT AGGGCTATAT TCTCTTGAAA 2700 AAAAGTGTCCTTACTTCCTG CCAATCCCCA AGTCATCTTC CAGAGTAAAA TGCAAATCCC 2760 ATCAGGCCACTTGGATGAAA ACCCTTCAAG GATTACTGGA TAGAATTCAG GCTTTCCCCT 2820 CCASCCCCCAATCATAGCTC ACAAACCTTC CTTGCTATTT GTTCTTAAGT AAAAAATCAT 2880 TTTTCCTCCTCCCTCCCCAA ACCCCAAGGA ACTCTCACTC TTGCTCAAGC TGTTCCGTCC 2940 CCTTACCACCCCTGATACAA CTGCCAGGTT AATTTCCAGA ATTCTTGCAA GACTCAGTTC 3000 AGAAGTCACCTTCTTTCGTG AATGTTTTGA TTCCCTGAGG CTACTTTATT TTGGTATGGC 3060 TGAAAAATCCTAGATTTTCT AAACAAAACC TGTTTGAATC TTGGTTCTGA TATGGACTAG 3120 GAGAGAGACTGGGTCAAGTA AGCTTATCTC CCTGAGGCTG TTTCCTCGTC TGTTAAGTGT 3180 GAATATCAATACCTGCCTTT CATAATCACC AGGGAATAAA GTGGAATAAT GTTGATAACA 3240 GTGCTTGGCACCTGGAAGTA GGTGGCAGAT GTTAACGCCC TTCCTCCCTT GCACTGCGCC 3300 CCCTGTGCCTACCTCTAGCA TTGTAACGAC CACATAGTAT TGAAATGGCC AGTTTACTTG 3360 TCTGCCTTCCTTTCCAAGAC CGTTGGTGCC 
TAGAGGACTA GAATCGTGTC CTATTTAACT 3420 TTGTGTTCCCAGGTCCTAGC TCAGGAGTTG GCAAATAAGA ATTAAATGTC TGCTACACCG 3480 AAACAAA 34872520 base pairs nucleic acid double linear cDNA NO 2 CGCGGAAGGGCAGAATGGGA CTCCAAGCCT GCCTCCTAGG GCTCTTTGCC CTCATCCTCT 60 CTGGCAAATGCAGTTACAGC CCGGAGCCCG ACCAGCGGAG GACGCTGCCC CCAGGCTGGG 120 TGTCCCTGGGCCGTGCGGAC CCTGAGGAAG AGCTGAGTCT CACCTTTGCC CTGAGACAGC 180 AGAATGTGGAAAGACTCTCG GAGCTGGTGC AGGCTGTGTC GGATCCCAGC TCTCCTCAAT 240 ACGGAAAATACCTGACCCTA GAGAATGTGG CTGATCTGGT GAGGCCATCC CCACTGACCC 300 TCCACACGGTGCAAAAATGG CTCTTGGCAG CCGGAGCCCA GAAGTGCCAT TCTGTGATCA 360 CACAGGACTTTCTGACTTGC TGGCTGAGCA TCCGACAAGC AGAGCTGCTG CTCCCTGGGG 420 CTGAGTTTCATCACTATGTG GGAGGACCTA CGGAAACCCA TGTTGTAAGG TCCCCACATC 480 CCTACCAGCTTCCACAGGCC TTGGCCCCCC ATGTGGACTT TGTGGGGGGA CTGCACCATT 540 TTCCCCCAACATCATCCCTG AGGCAACGTC CTGAGCCGCA GGTGACAGGG ACTGTAGGCC 600 TGCATCTGGGGGTAACCCCC TCTGTGATCC GTAAGCGATA CAACTTGACC TCACAAGACG 660 TGGGCTCTGGCACCAGCAAT AACAGCCAAG CCTGTGCCCA GTTCCTGGAG CAGTATTTCC 720 ATGACTCAGACCTGGCTCAG TTCATGCGCC TCTTCGGTGG CAACTTTGCA CATCAGGCAT 780 CAGTAGCCCGTGTGGTTGGA CAACAGGGCC GGGGCCGGGC CGGGATTGAG GCCAGTCTAG 840 ATGTGCAGTACCTGATGAGT GCTGGTGCCA ACATCTCCAC CTGGGTCTAC AGTAGCCCTG 900 GCCGGCATGAGGGACAGGAG CCCTTCCTGC AGTGGCTCAT GCTGCTCAGT AATGAGTCAT 960 CCCTGCCACATGTGCATACT GTGAGCTATG GAGATGATGA GGACTCCCTC AGCAGCGCCT 1020 ACATCCAGCGGGTCAACACT GAGCTCATGA AGGCTGCTGC TCGGGGTCTC ACCCTGCTCT 1080 TCGCCTCAGGTGACAGTGGG GCCGGGTGTT GGTCTGTCTC TGGAAGACAC CAGTTCCGCC 1140 CTACCTTCCCTGCCTCCAGC CCCTATGTCA CCACAGTGGG AGGCACATCC TTCCAGGAAC 1200 CTTTCCTCATCACAAATGAA ATTGTTGACT ATATCAGTGG TGGTGGCTTC AGCAATGTGT 1260 TCCCACGGCCTTCATACCAG GAGGAAGCTG TAACGAAGTT CCTGAGCTCT AGCCCCCACC 1320 TGCCACCATCCAGTTACTTC AATGCCAGTG GCCGTGCCTA CCCAGATGTG GCTGCACTTT 1380 CTGATGGCTACTGGGTGGTC AGCAACAGAG TGCCCATTCC ATGGGTGTCC GGAACCTCGG 1440 CCTCTACTCCAGTGTTTGGG GGGATCCTAT CCTTGATCAA TGAGCACAGG ATCCTTAGTG 1500 GCCGCCCCCCTCTTGGCTTT CTCAACCCAA GGCTCTACCA GCAGCATGGG GCAGGACTCT 1560 TTGATGTAACCCGTGGCTGC 
CATGAGTCCT GTCTGGATGA AGAGGTAGAG GGCCAGGGTT 1620 TCTGCTCTGGTCCTGGCTGG GATCCTGTAA CAGGCTGGGG AACACCCAAC TTCCCAGCTT 1680 TGCTGAAGACTCTACTCAAC CCCTGACCCT TTCCTATCAG GAGAGATGGC TTGTCCCCTG 1740 CCCTGAAGCTGGCAGTTCAG TCCCTTATTC TGCCCTGTTG GAAGCCCTGC TGAACCCTCA 1800 ACTATTGACTGCTGCAGACA GCTTATCTCC CTAACCCTGA AATGCTGTGA GCTTGACTTG 1860 ACTCCCAACCCTACCATGCT CCATCATACT CAGGTCTCCC TACTCCTGCC TTAGATTCCT 1920 CAATAAGATGCTGTAACTAG CATTTTTTGA ATGCCTCTCC CTCCGCATCT CATCTTTCTC 1980 TTTTCAATCAGGCTTTTCCA AAGGGTTGTA TACAGACTCT GTGCACTATT TCACTTGATA 2040 TTCATTCCCCAATTCACTGC AAGGAGACCT CTACTGTCAC CGTTTACTCT TTCCTACCCT 2100 GACATCCAGAAACAATGGCC TCCAGTGCAT ACTTCTCAAT CTTTGCTTTA TGGCCTTTCC 2160 ATCATAGTTGCCCACTCCCT CTCCTTACTT AGCTTCCAGG TCTTAACTTC TCTGACTACT 2220 CTTGTCTTCCTCTCTCATCA ATTTCTGCTT CTTCATGGAA TGCTGACCTT CATTGCTCCA 2280 TTTGTAGATTTTTGCTCTTC TCAGTTTACT CATTGTCCCC TGGAACAAAT CACTGACATC 2340 TACAACCATTACCATCTCAC TAAATAAGAC TTTCTATCCA ATAATGATTG ATACCTCAAA 2400 TGTAAGATGCGTGATACTCA ACATTTCATC GTCCACCTTC CCAACCCCAA ACAATTCCAT 2460 CTCGTTTCTTCTTGGTAAAT GATGCTATGC TTTTTCCAAC CAAAAAAAAA AAAAAAAAAA 2520 563 aminoacids amino acid single linear protein NO 3 Met Gly Leu Gln Ala Cys LeuLeu Gly Leu Phe Ala Leu Ile Leu Ser 1 5 10 15 Gly Lys Cys Ser Tyr SerPro Glu Pro Asp Gln Arg Arg Thr Leu Pro 20 25 30 Pro Gly Trp Val Ser LeuGly Arg Ala Asp Pro Glu Glu Glu Leu Ser 35 40 45 Leu Thr Phe Ala Leu ArgGln Gln Asn Val Glu Arg Leu Ser Glu Leu 50 55 60 Val Gln Ala Val Ser AspPro Ser Ser Pro Gln Tyr Gly Lys Tyr Leu 65 70 75 80 Thr Leu Glu Asn ValAla Asp Leu Val Arg Pro Ser Pro Leu Thr Leu 85 90 95 His Thr Val Gln LysTrp Leu Leu Ala Ala Gly Ala Gln Lys Cys His 100 105 110 Ser Val Ile ThrGln Asp Phe Leu Thr Cys Trp Leu Ser Ile Arg Gln 115 120 125 Ala Glu LeuLeu Leu Pro Gly Ala Glu Phe His His Tyr Val Gly Gly 130 135 140 Pro ThrGlu Thr His Val Val Arg Ser Pro His Pro Tyr Gln Leu Pro 145 150 155 160Gln Ala Leu Ala Pro His Val Asp Phe Val Gly Gly Leu His His Phe 165 170175 Pro Pro Thr Ser Ser Leu Arg Gln 
Arg Pro Glu Pro Gln Val Thr Gly 180185 190 Thr Val Gly Leu His Leu Gly Val Thr Pro Ser Val Ile Arg Lys Arg195 200 205 Tyr Asn Leu Thr Ser Gln Asp Val Gly Ser Gly Thr Ser Asn AsnSer 210 215 220 Gln Ala Cys Ala Gln Phe Leu Glu Gln Tyr Phe His Asp SerAsp Leu 225 230 235 240 Ala Gln Phe Met Arg Leu Phe Gly Gly Asn Phe AlaHis Gln Ala Ser 245 250 255 Val Ala Arg Val Val Gly Gln Gln Gly Arg GlyArg Ala Gly Ile Glu 260 265 270 Ala Ser Leu Asp Val Gln Tyr Leu Met SerAla Gly Ala Asn Ile Ser 275 280 285 Thr Trp Val Tyr Ser Ser Pro Gly ArgHis Glu Gly Gln Glu Pro Phe 290 295 300 Leu Gln Trp Leu Met Leu Leu SerAsn Glu Ser Ala Leu Pro His Val 305 310 315 320 His Thr Val Ser Tyr GlyAsp Asp Glu Asp Ser Leu Ser Ser Ala Tyr 325 330 335 Ile Gln Arg Val AsnThr Glu Leu Met Lys Ala Ala Ala Arg Gly Leu 340 345 350 Thr Leu Leu PheAla Ser Gly Asp Ser Gly Ala Gly Cys Trp Ser Val 355 360 365 Ser Gly ArgHis Gln Phe Arg Pro Thr Phe Pro Ala Ser Ser Pro Tyr 370 375 380 Val ThrThr Val Gly Gly Thr Ser Phe Gln Glu Pro Phe Leu Ile Thr 385 390 395 400Asn Glu Ile Val Asp Tyr Ile Ser Gly Gly Gly Phe Ser Asn Val Phe 405 410415 Pro Arg Pro Ser Tyr Gln Glu Glu Ala Val Thr Lys Phe Leu Ser Ser 420425 430 Ser Pro His Leu Pro Pro Ser Ser Tyr Phe Asn Ala Ser Gly Arg Ala435 440 445 Tyr Pro Asp Val Ala Ala Leu Ser Asp Gly Tyr Trp Val Val SerAsn 450 455 460 Arg Val Pro Ile Pro Trp Val Ser Gly Thr Ser Ala Ser ThrPro Val 465 470 475 480 Phe Gly Gly Ile Leu Ser Leu Ile Asn Glu His ArgIle Leu Ser Gly 485 490 495 Arg Pro Pro Leu Gly Phe Leu Asn Pro Arg LeuTyr Gln Gln His Gly 500 505 510 Ala Gly Leu Phe Asp Val Thr Arg Gly CysHis Glu Ser Cys Leu Asp 515 520 525 Glu Glu Val Glu Gly Gln Gly Phe CysSer Gly Pro Gly Trp Asp Pro 530 535 540 Val Thr Gly Trp Gly Thr Pro AsnPhe Pro Ala Leu Leu Lys Thr Leu 545 550 555 560 Leu Asn Pro 587 aminoacids amino acid single linear protein NO 4 Met Lys Ser Ser Ala Ala LysGln Thr Val Leu Cys Leu Asn Arg Tyr 1 5 10 15 Ala Val Val Ala Leu ProLeu Ala Ile Ala Ser Phe Ala Ala Phe Gly 20 25 30 
Ala Ser Pro Ala Ser ThrLeu Trp Ala Pro Thr Asp Thr Lys Ala Phe 35 40 45 Val Thr Pro Ala Gln ValGlu Ala Arg Ser Ala Ala Pro Leu Leu Glu 50 55 60 Leu Ala Ala Gly Glu ThrAla His Ile Val Val Ser Leu Lys Leu Arg 65 70 75 80 Asp Glu Ala Gln LeuLys Gln Leu Ala Gln Ala Val Asn Gln Pro Gly 85 90 95 Asn Ala Gln Phe GlyLys Phe Leu Lys Arg Arg Gln Phe Leu Ser Gln 100 105 110 Phe Ala Pro ThrGlu Ala Gln Val Gln Ala Val Val Ala His Leu Arg 115 120 125 Lys Asn GlyPhe Val Asn Ile His Val Val Pro Asn Arg Leu Leu Ile 130 135 140 Ser AlaAsp Gly Ser Ala Gly Ala Val Lys Ala Ala Phe Asn Thr Pro 145 150 155 160Leu Val Arg Tyr Gln Leu Asn Gly Lys Ala Gly Tyr Ala Asn Thr Ala 165 170175 Pro Ala Gln Val Pro Gln Asp Leu Gly Glu Ile Val Gly Ser Val Leu 180185 190 Gly Leu Gln Asn Val Thr Arg Ala His Pro Met Leu Lys Val Gly Glu195 200 205 Arg Ser Ala Ala Lys Thr Leu Ala Ala Gly Thr Ala Lys Gly HisAsn 210 215 220 Pro Thr Glu Phe Pro Thr Ile Tyr Asp Ala Ser Ser Ala ProThr Ala 225 230 235 240 Ala Asn Thr Thr Val Gly Ile Ile Thr Ile Gly GlyVal Ser Gln Thr 245 250 255 Leu Gln Asp Leu Gln Gln Phe Thr Ser Ala AsnGly Leu Ala Ser Val 260 265 270 Asn Thr Gln Thr Ile Gln Thr Gly Ser SerAsn Gly Asp Tyr Ser Asp 275 280 285 Asp Gln Gln Gly Gln Gly Glu Trp AspLeu Asp Ser Gln Ser Ile Val 290 295 300 Gly Ser Ala Gly Gly Ala Val GlnGln Leu Leu Phe Tyr Met Ala Asp 305 310 315 320 Gln Ser Ala Ser Gly AsnThr Gly Leu Thr Gln Ala Phe Asn Gln Ala 325 330 335 Val Ser Asp Asn ValAla Lys Val Ile Asn Val Ser Leu Gly Trp Cys 340 345 350 Glu Ala Asp AlaAsn Ala Asp Gly Thr Leu Gln Ala Glu Asp Arg Ile 355 360 365 Phe Ala ThrAla Ala Ala Gln Gly Gln Thr Phe Ser Val Ser Ser Gly 370 375 380 Asp GluGly Val Tyr Glu Cys Asn Asn Arg Gly Tyr Pro Asp Gly Ser 385 390 395 400Thr Tyr Ser Val Ser Trp Pro Ala Ser Ser Pro Asn Val Ile Ala Val 405 410415 Gly Gly Thr Thr Leu Tyr Thr Thr Ser Ala Gly Ala Tyr Ser Asn Glu 420425 430 Thr Val Trp Asn Glu Gly Leu Asp Ser Asn Gly Lys Leu Trp Ala Thr435 440 445 Gly Gly Gly Tyr Ser Val Tyr Glu Ser Lys Pro 
Ser Trp Gln SerVal 450 455 460 Val Ser Gly Thr Pro Gly Arg Arg Leu Leu Pro Asp Ile SerPhe Asp 465 470 475 480 Ala Ala Gln Gly Thr Gly Ala Leu Ile Tyr Asn TyrGly Gln Leu Gln 485 490 495 Gln Ile Gly Gly Thr Ser Leu Ala Ser Pro IlePhe Val Gly Leu Trp 500 505 510 Ala Arg Leu Gln Ser Ala Asn Ser Asn SerLeu Gly Phe Pro Ala Ala 515 520 525 Ser Phe Tyr Ser Ala Ile Ser Ser ThrPro Ser Leu Val His Asp Val 530 535 540 Lys Ser Gly Asn Asn Gly Tyr GlyGly Tyr Gly Tyr Asn Ala Gly Thr 545 550 555 560 Gly Trp Asp Tyr Pro ThrGly Trp Gly Ser Leu Asp Ile Ala Lys Leu 565 570 575 Ser Ala Tyr Ile ArgSer Asn Gly Phe Gly His 580 585 635 amino acids amino acid single linearprotein NO 5 Met Lys Ile Glu Lys Thr Ala Leu Thr Val Ala Ile Ala Leu AlaMet 1 5 10 15 Ser Ser Leu Ser Ala His Ala Glu Asp Ala Trp Val Ser ThrHis Thr 20 25 30 Gln Ala Ala Met Ser Pro Pro Ala Ser Thr Gln Val Leu AlaAla Ser 35 40 45 Ser Thr Ser Ala Thr Thr Thr Gly Asn Ala Tyr Thr Leu AsnMet Thr 50 55 60 Gly Ser Pro Arg Ile Asp Gly Ala Ala Val Thr Ala Leu GluAla Asp 65 70 75 80 His Pro Leu His Val Glu Val Ala Leu Lys Leu Arg AsnPro Asp Ala 85 90 95 Leu Gln Thr Phe Leu Ala Gly Val Thr Thr Pro Gly SerAla Leu Phe 100 105 110 Gly Lys Phe Leu Thr Pro Ser Gln Phe Thr Glu ArgPhe Gly Pro Thr 115 120 125 Gln Ser Gln Val Asp Ala Val Val Ala His LeuGln Gln Ala Gly Phe 130 135 140 Thr Asn Ile Glu Val Ala Pro Asn Arg LeuLeu Ile Ser Ala Asp Gly 145 150 155 160 Thr Ala Gly Ala Ala Thr Asn GlyPhe Arg Thr Ser Ile Lys Arg Phe 165 170 175 Ser Ala Asn Gly Arg Glu PhePhe Ala Asn Asp Ala Pro Ala Leu Val 180 185 190 Pro Ala Ser Leu Gly AspSer Val Asn Ala Val Leu Gly Leu Gln Asn 195 200 205 Val Ser Val Lys HisThr Leu His His Val Tyr His Pro Glu Asp Val 210 215 220 Thr Val Pro GlyPro Asn Val Gly Thr Gln Ala Ala Ala Ala Val Ala 225 230 235 240 Ala HisHis Pro Gln Asp Phe Ala Ala Ile Tyr Gly Gly Ser Ser Leu 245 250 255 ProAla Ala Thr Asn Thr Ala Val Gly Ile Ile Thr Trp Gly Ser Ile 260 265 270Thr Gln Thr Val Thr Asp Leu Asn Ser Phe Thr Ser Gly Ala Gly Leu 
275 280285 Ala Thr Val Asn Ser Thr Ile Thr Lys Val Gly Ser Gly Thr Phe Ala 290295 300 Asn Asp Pro Asp Ser Asn Gly Glu Trp Ser Leu Asp Ser Gln Asp Ile305 310 315 320 Val Gly Ile Ala Gly Gly Val Lys Gln Leu Ile Phe Tyr ThrSer Ala 325 330 335 Asn Gly Asp Ser Ser Ser Ser Gly Ile Thr Asp Ala GlyIle Thr Ala 340 345 350 Ser Tyr Asn Arg Ala Val Thr Asp Asn Ile Ala LysLeu Ile Asn Val 355 360 365 Ser Leu Gly Glu Asp Glu Thr Ala Ala Gln GlnSer Gly Thr Gln Ala 370 375 380 Ala Asp Asp Ala Ile Phe Gln Gln Ala ValAla Gln Gly Gln Thr Phe 385 390 395 400 Ser Ile Ala Ser Gly Asp Ala GlyVal Tyr Gln Trp Ser Thr Asp Pro 405 410 415 Thr Ser Gly Ser Pro Gly TyrVal Ala Asn Ser Ala Gly Thr Val Lys 420 425 430 Ile Asp Leu Thr His TyrSer Val Ser Glu Pro Ala Ser Ser Pro Tyr 435 440 445 Val Ile Gln Val GlyGly Thr Thr Leu Ser Thr Ser Gly Thr Thr Trp 450 455 460 Ser Gly Glu ThrVal Trp Asn Glu Gly Leu Ser Ala Ile Ala Pro Ser 465 470 475 480 Gln GlyAsp Asn Asn Gln Arg Leu Trp Ala Thr Gly Gly Gly Val Ser 485 490 495 LeuTyr Glu Ala Ala Pro Ser Trp Gln Ser Ser Val Ser Ser Ser Thr 500 505 510Lys Arg Val Gly Pro Asp Leu Ala Phe Asp Ala Ala Ser Ser Ser Gly 515 520525 Ala Leu Ile Val Val Asn Gly Ser Thr Glu Gln Val Gly Gly Thr Ser 530535 540 Leu Ala Ser Pro Leu Phe Val Gly Ala Phe Ala Arg Ile Glu Ser Ala545 550 555 560 Ala Asn Asn Ala Ile Gly Phe Pro Ala Ser Lys Phe Tyr GlnAla Phe 565 570 575 Pro Thr Gln Thr Ser Leu Leu His Asp Val Thr Ser GlyAsn Asn Gly 580 585 590 Tyr Gln Ser His Gly Tyr Thr Ala Ala Thr Gly PheAsp Glu Ala Thr 595 600 605 Gly Phe Gly Ser Phe Asp Ile Gly Lys Leu AsnThr Tyr Ala Gln Ala 610 615 620 Asn Trp Val Thr Gly Gly Gly Gly Gly SerThr 625 630 635 20 base pairs nucleic acid single linear other nucleicacid /desc = “Oligonucleotides” NO 6 GTGATCACAG AATGGCACTT 20 20 basepairs nucleic acid single linear other nucleic acid /desc =“Oligonucleotides” NO 7 AACATGGGTT TCCGTAGGTC 20 20 base pairs nucleicacid single linear other nucleic acid /desc = “Oligonucleotides” NO 8CTTCCTCAGG GTCCGCACGG 
20
38 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotides” NO 9
TGTAAAACGA CGGCCAGTCA GACCTTCCAG TAGGGACC 38
38 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotides” NO 10
CAGGAAACAG CTATGACCCT GTATCCCACA CAAGAGAT 38
38 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotides” NO 11
TGTAAAACGA CGGCCAGTTA GATGCCATTG GGGACTGG 38
38 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotides” NO 12
CAGGAAACAG CTATGACCGT CATGGAAATA CTGCTCCA 38
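As a consistency check on the listing, the 20-base oligonucleotides of SEQ ID NOs: 6-8 can be compared with the cDNA of SEQ ID NO: 1. The sketch below is illustrative only and is not part of the original disclosure. The names `CDNA_WINDOWS`, `reverse_complement` and `locate_antisense` are hypothetical. The three cDNA windows were transcribed from the listing on the assumption that each printed position number marks the last base of its row. The sketch reverse-complements each oligonucleotide and searches the windows for it.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Short windows transcribed from SEQ ID NO: 1, keyed by the 1-based position
# of their first base (assumption: printed numbers mark the end of each row).
CDNA_WINDOWS = {
    121: "TGTCCCTGGGCCGTGCGGACCCTGAGGAAGAGCTGAGTCT",
    331: "CCGGAGCCCAGAAGTGCCATTCTGTGATCACACAGGACTT",
    441: "GGAGGACCTACGGAAACCCATGTTGTAAGG",
}

# Oligonucleotides as given in the sequence listing (grouping spaces removed).
OLIGOS = {
    "SEQ ID NO: 6": "GTGATCACAGAATGGCACTT",
    "SEQ ID NO: 7": "AACATGGGTTTCCGTAGGTC",
    "SEQ ID NO: 8": "CTTCCTCAGGGTCCGCACGG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_antisense(oligo: str):
    """Return the 1-based (first, last) cDNA positions matched by the
    reverse complement of `oligo`, or None if no window contains it."""
    target = reverse_complement(oligo)
    for start, window in CDNA_WINDOWS.items():
        index = window.find(target)
        if index >= 0:
            first = start + index
            return first, first + len(target) - 1
    return None


for name, oligo in OLIGOS.items():
    print(name, "->", locate_antisense(oligo))
```

Under these assumptions each of the three oligonucleotides matches the cDNA in the antisense orientation, at positions 342-361, 445-464 and 131-150 of SEQ ID NO: 1 respectively.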

What is claimed is:
 1. An isolated CLN2 protein with the following characteristics: a) said CLN2 is a protein with pepstatin-insensitive carboxyl protease activity; and, b) mutation or absence of said CLN2 is causative of classical late infantile neuronal ceroid lipofuscinosis (LINCL).
 2. A chimeric protein comprising the CLN2 protein of claim 1.
 3. A purified nucleic acid encoding CLN2, or a fragment thereof having at least 15 nucleotides.
 4. The nucleic acid of claim 3 which encodes CLN2 having an amino acid sequence as depicted in FIG. 3 (SEQ ID NO:3).
 5. The nucleic acid of claim 4 having a nucleotide sequence as depicted in FIG. 3 (SEQ ID NO:1), corresponding allelic genes, homologous genes from other species, and nucleotide sequences comprising all or portions of CLN2 genes which are altered by the substitution of different codons that encode the same amino acid residue within the amino acid sequence (SEQ ID NO:3), thus producing a silent change.
 6. The purified nucleic acid of claim 3 which is DNA.
 7. A recombinant DNA expression vector comprising the DNA of claim 6, wherein the DNA encoding the CLN2 is operatively associated with an expression control sequence.
 8. A transformed host cell comprising the DNA vector of claim 7.
 9. A recombinant virus comprising the DNA vector of claim 7.
 10. The recombinant virus of claim 9 selected from the group consisting of a retrovirus, herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, and adeno-associated virus (AAV).
 11. A method for producing a CLN2 comprising culturing the transformed host cell of claim 8 under conditions that provide for expression of the CLN2.
 12. The method according to claim 11 wherein the host cell is a bacterium.
 13. The method according to claim 11 wherein the host cell is a mammalian cell.
 14. A method for increasing the level of expression of a CLN2 comprising introducing an expression vector of claim 7 into a host in vivo under conditions that provide for expression of the CLN2.
 15. The method according to claim 14 wherein the expression vector is a viral expression vector.
 16. The method according to claim 14 wherein the expression vector is a naked DNA expression vector.
 17. A method for treating LINCL in an animal by increasing the level of CLN2 in cells.
 18. The method according to claim 17, wherein the level of CLN2 is increased by administration of CLN2 to the animal.
 19. The method according to claim 18, wherein the level of CLN2 is increased by administration of a recombinant expression vector to the affected cells, which expression vector provides for expression of the CLN2 in vivo.
 20. An oligonucleotide of greater than 20 nucleotides which hybridizes under stringent conditions to the nucleic acid of claim 3, wherein the T_(m) is greater than 60° C.
 21. The oligonucleotide of claim 20 which is an anti-sense oligonucleotide.
 22. An antibody specific for CLN2 of claim 1.
 23. The antibody of claim 22 which is labeled.
 24. A method for detecting CLN2 in a biological sample comprising: a) contacting a biological sample with an antibody of claim 22 under conditions that allow for antibody binding to antigen; and, b) detecting formation of reaction complexes comprising the antibody and CLN2 in the sample; wherein detection of formation of reaction complexes indicates the presence of CLN2 in the sample.
 25. A method for quantitating the level of CLN2 in a biological sample comprising: a) detecting the formation of reaction complexes in a biological sample according to the method of claim 24; and, b) evaluating the amount of reaction complexes formed; wherein the amount of reaction complexes corresponds to the level of CLN2 in the biological sample.
 26. A method for measuring CLN2 activity in a biological sample based on the amount of CLN2 pepstatin-insensitive carboxyl protease activity relative to normal controls in a biological sample.
 27. A method for detecting CLN2 in a biological sample comprising: a) contacting a biological sample with an oligonucleotide of claim 21 under conditions that allow for hybridization with mRNA; and, b) detecting hybridization of the oligonucleotide to mRNA in the sample; wherein detection of hybridization indicates the presence of CLN2 in the sample.
 28. A method for quantitating the level of CLN2 in a biological sample comprising evaluating the quantity of oligonucleotide hybridized according to the method of claim 26, wherein the quantity of oligonucleotide hybridized corresponds to the level of CLN2 in the biological sample.
 29. A method for detecting the CLN2 gene, and mutant variants associated with LINCL, in chromosomal samples comprising: a) contacting a chromosomal sample from, for example, amniotic fluid, with an oligonucleotide of claim 21, or variants of said oligonucleotide that hybridize to mutant alleles of CLN2, under conditions that allow for hybridization; and, b) detecting hybridization of the oligonucleotide to the chromosomes in the sample; wherein detection of hybridization is used as a method of prenatal screening for LINCL.
 30. A method for identification of lysosomal proteins based on the presence of mannose 6-phosphate glycosylation and comprising the following steps: a) purifying proteins from a biological sample using an affinity column consisting of the mannose 6-phosphate receptor immobilized on a solid support; b) peptide sequencing of selected purified proteins; c) designing nucleic acid probes based on the peptide sequences derived in step b; and, d) using the probes of step c to isolate and characterize the genes encoding the purified lysosomal proteins.
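Step c of claim 30 recites designing nucleic acid probes from peptide sequence. One common way to approach such a design is to back-translate the peptide and select the stretch that can be encoded by the fewest codon combinations. The following Python sketch illustrates that idea only. It is not part of the original disclosure and does not purport to be the procedure used by the inventors. The names `degeneracy` and `least_degenerate_window` are hypothetical, and the example peptide (residues 241-256 of SEQ ID NO: 3, in one-letter code) and the window width of six residues were chosen arbitrarily for illustration.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of synonymous codons for each residue (e.g. Met = 1, Leu = 6).
CODONS_PER_RESIDUE: dict[str, int] = {}
for _codon, _aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_PER_RESIDUE[_aa] = CODONS_PER_RESIDUE.get(_aa, 0) + 1


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


def least_degenerate_window(peptide: str, width: int) -> tuple[str, int]:
    """Return the first `width`-residue window with the lowest degeneracy,
    together with that degeneracy."""
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    best = min(windows, key=degeneracy)
    return best, degeneracy(best)


# Illustrative input: residues 241-256 of SEQ ID NO: 3 (arbitrary choice).
PEPTIDE = "AQFMRLFGGNFAHQAS"

window, n_sequences = least_degenerate_window(PEPTIDE, 6)
print(f"{window}: {n_sequences} possible 18-nucleotide encodings")
```

A probe pool built on the selected window would need to cover every encoding counted here. In practice, codon usage bias and hybridization conditions would also inform the choice, and neither is modeled in this sketch.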